Giter VIP home page Giter VIP logo

openbiosim / biosimspace Goto Github PK

View Code? Open in Web Editor NEW
70.0 6.0 10.0 39.76 MB

An interoperable Python framework for biomolecular simulation.

Home Page: https://biosimspace.openbiosim.org

License: GNU General Public License v3.0

Python 95.12% Shell 0.03% Jupyter Notebook 4.42% TeX 0.18% C++ 0.26% Batchfile 0.01%
computational-biology computational-chemistry computational-physics drug-discovery interoperability molecular-dynamics reproducibility reproducible-research molecular-simulation biomolecular-simulation

biosimspace's People

Contributors

adelehardie avatar akalpokas avatar annamherz avatar chryswoods avatar fjclark avatar janjoswig avatar jenkescheen avatar jmichel80 avatar kexul avatar lohedges avatar mb2055 avatar msuruzhon avatar ppxasjsm avatar ptosco avatar xiki-tempula avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

biosimspace's Issues

[BUG] Crystallographic waters could be removed during solvation

I came across this potential problem when reading this issue thread. For OpenMM, existing crystallographic waters are (now) correctly preserved when adding ions. There is no logic do deal with this when using gmx genion, other than specifying the minimum distance between ions and non-solvent with -rmin, which doesn't seem to work very well. Although I imagine that this issue would rarely occur, we might need to add some logic to check which waters were replaced, and re-run if any crystallographic waters were removed. (Alternatively, we could just switch to using openmm.Modeller if it's more reliable.)

[BUG] BSS IO flipping the box angle

Describe the bug
I have an amber.crd file. If I use cpptraj to convert it to RST7 file, I got the angle
120.0000229 120.0000229 90.0000000
If I use BSS to read in the file and write it out as RST7 file, I got the angle
59.9999771 59.9999771 90.0000000

To Reproduce
To generate the rst7 from cpptraj

cpptraj -p amber.prm7
trajin amber.crd
trajout amber_out.rst7
run

To generate the rst7 from BSS

import BioSimSpace.Sandpit.Exscientia as BSS
top = BSS.IO.readMolecules(['amber.prm7', 'amber.crd'])
BSS.IO.saveMolecules('BSS_out', top, 'RST7')

Expected behavior
They are the same.

Input files
Archive.zip

Default SOMD Options Give Extremely Poor Performance

Hi,

The default SOMD n_cycles and nmoves options are resulting in very poor performance for me. Running the script shown here with the GROMACS process replaced with the SOMD process results in the following simfile options:

save coordinates = True
ncycles = 500
nmoves = 100
ncycles_per_snap = 1
buffered coordinates frequency = 0
timestep = 2.00 femtosecond
reaction field dielectric = 78.3
cutoff type = cutoffperiodic
cutoff distance = 8 angstrom
barostat = True
pressure = 1.00000 atm
thermostat = True
temperature = 300.00 kelvin
gpu = 0

As a result of the extremely large number of cycles for few moves, my GPU utilisation drops from ~ 90 % (with nmoves = 50000 and ncycles=1) to ~ 4 % and simulations become extremely slow. Could the number of cycles please be reduced to avoid this performance hit? Thanks.

Failure to Write rst7 files

Hi,

This is likely more of a sire issue, but I'm getting some strange behaviour when trying to save rst files. When I run the following script:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
os.environ["BSS_VERBOSE_ERRORS"] = "1"
import BioSimSpace as BSS

system = BSS.IO.readMolecules(["free_preequil.prm7", "free_preequil.rst7"])

protocol = BSS.Protocol.Production( runtime=100*BSS.Units.Time.picosecond)

#process = BSS.Process.Gromacs(system, protocol, work_dir="output")
# The SOMD process below runs extremely slowly due to very low n_moves
process = BSS.Process.Somd(system, protocol, platform="CUDA",  work_dir="output", extra_options={"n_moves": 50000, "n_cycles":1})
process.start()
process.wait()
# If block==True, then saving to the rst7 (but not prm7) file will fail
final_system = process.getSystem(block=False)

if process.isError():
    print(process.stdout())
    print(process.stderr())
    raise BSS._Exceptions.ThirdPartyError("The process failed.")
if system is None:
    print(process.stdout())
    print(process.stderr())
    raise BSS._Exceptions.ThirdPartyError("The process failed.")

print(system)
BSS.IO.saveMolecules(f"output_sys", final_system, fileformat=["rst7", "prm7"])

I consistently get the following error:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/finlayclark/software/devel/BioSimSpace_own/python/BioSimSpace/IO/_io.py:723 in             │
│ saveMolecules                                                                                    │
│                                                                                                  │
│    720 │   │   │   if format == "PRM7" or format == "RST7":                                      │
│    721 │   │   │   │   system_copy = system.copy()                                               │
│    722 │   │   │   │   system_copy._set_water_topology("AMBER", _property_map)                   │
│ ❱  723 │   │   │   │   file = _SireIO.MoleculeParser.save(                                       │
│    724 │   │   │   │   │   system_copy._sire_object, filebase, _property_map                     │
│    725 │   │   │   │   )                                                                         │
│    726 │   │   │   elif format == "GroTop":                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: SireError::io_error: Cannot write the (perhaps some of the ) files as the following errors occurred:
Failed to write the file '/home/finlayclark/Documents/research/adaptive_sampling/full_run_test/test_getSystem/output_sys.rst7' using the parser for 
fileformat 'RST7'. Errors reported by the parser are:
Errors converting the system to a Amber Rst7 format...
Could not write the float at index 12149, value '-1072.48' into the specified format AmberFormat( 6 x float[width = 12, precision = 7] ). (call 
sire.error.get_last_error_details() for more info)

The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/finlayclark/Documents/research/adaptive_sampling/full_run_test/test_getSystem/test_getSyst │
│ em.py:28 in <module>                                                                             │
│                                                                                                  │
│   25 │   raise BSS._Exceptions.ThirdPartyError("The process failed.")                            │
│   26                                                                                             │
│   27 print(system)                                                                               │
│ ❱ 28 BSS.IO.saveMolecules(f"output_sys", final_system, fileformat=["rst7", "prm7"])              │
│   29                                                                                             │
│                                                                                                  │
│ /home/finlayclark/software/devel/BioSimSpace_own/python/BioSimSpace/IO/_io.py:759 in             │
│ saveMolecules                                                                                    │
│                                                                                                  │
│    756 │   │   except Exception as e:                                                            │
│    757 │   │   │   msg = "Failed to save system to format: '%s'" % format                        │
│    758 │   │   │   if _isVerbose():                                                              │
│ ❱  759 │   │   │   │   raise IOError(msg) from e                                                 │
│    760 │   │   │   else:                                                                         │
│    761 │   │   │   │   raise IOError(msg) from None                                              │
│    762                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: Failed to save system to format: 'RST7'

However, if I set final_system = process.getSystem(block=False), then this works without issue consistently. Please see the script and input here.

I attempted to test this with SOMD too (to check if the issue was just gromacs -> amber conversion), but the default simfile options meant that SOMD ran too slowly to complete even a short test (I'll raise a separate issue about this).

The reason that I have been setting block=True in getSystem is that this was often returning None instead of a system, despite running process.wait() beforehand, e.g. here. However, I have so far been unable to reproduce this in a short script so far.

  • OS: Ubuntu 22
  • Version of Python: 3.9.15
  • Version of BioSimSpace: latest dev release, installed this morning

Thanks.

[BUG] Coordinate not being updated after em

Describe the bug
I think the new caching might be a bit broken in that if you do an energy minimisation, the coordiante is not being updated when you write it down in the same format.

To Reproduce

import BioSimSpace.Sandpit.Exscientia as BSS
mol = pickle.load(open('mol.p', 'rb'))
vac = mol.toSystem()
vac.setBox(3*[4*BSS.Units.Length.nanometer])

protocol = BSS.Protocol.Minimisation()
process = BSS.Process.Amber(vac, protocol, exe=amber_exe, work_dir="em")
process.start()
process.wait()
assert not process.isError()
em = process.getSystem()

process = BSS.Process.Amber(em, protocol, exe=amber_exe, work_dir="equil")

BSS.IO.saveMolecules('em', em, 'rst7')
BSS.IO.saveMolecules('em', em, 'gro87')

Expected behavior
I would imagine that the coordinate after em equil/amber.rst7 will be different from the coordinate before the em em/amber.rst7.
The same system writing down as amber RST7 em.rst7 will be the same as the Gro file em.gro

But the equil/amber.rst7

BioSimSpace_System
   12
  20.0424958  20.0210173  18.6057628  21.0783599  19.4264583  19.3263467
  21.0356408  19.4053314  20.7203907  19.9570549  19.9786839  21.3938791
  18.9211843  20.5731855  20.6732578  18.9639111  20.5943132  19.2792095
  20.0747780  20.0370579  17.5204133  21.9186800  18.9804402  18.8023066
  21.8422027  18.9423722  21.2813245  19.9238789  19.9618695  22.4795867
  18.0813200  21.0196591  21.1972860  18.1577341  21.0576278  18.7178728
  40.0000000  40.0000000  40.0000000  90.0000000  90.0000000  90.0000000

Is the same as the coordinate before em.

BioSimSpace_System
   12
  20.0424958  20.0210173  18.6057628  21.0783599  19.4264583  19.3263467
  21.0356408  19.4053314  20.7203907  19.9570549  19.9786839  21.3938791
  18.9211843  20.5731855  20.6732578  18.9639111  20.5943132  19.2792095
  20.0747780  20.0370579  17.5204133  21.9186800  18.9804402  18.8023066
  21.8422027  18.9423722  21.2813245  19.9238789  19.9618695  22.4795867
  18.0813200  21.0196591  21.1972860  18.1577341  21.0576278  18.7178728
  40.0000000  40.0000000  40.0000000  90.0000000  90.0000000  90.0000000

And is the same as the em.rst7

BioSimSpace_System
   12
  20.0424958  20.0210173  18.6057628  21.0783599  19.4264583  19.3263467
  21.0356408  19.4053314  20.7203907  19.9570549  19.9786839  21.3938791
  18.9211843  20.5731855  20.6732578  18.9639111  20.5943132  19.2792095
  20.0747780  20.0370579  17.5204133  21.9186800  18.9804402  18.8023066
  21.8422027  18.9423722  21.2813245  19.9238789  19.9618695  22.4795867
  18.0813200  21.0196591  21.1972860  18.1577341  21.0576278  18.7178728
  40.0000000  40.0000000  40.0000000  90.0000000  90.0000000  90.0000000

But this is different from em.gro

BioSimSpace_System
   12
    1LIG    C1x    1   1.984   2.012   1.899
    1LIG    C2x    2   2.099   1.947   1.934
    1LIG    C3x    3   2.104   1.940   2.074
    1LIG    C4x    4   1.992   2.000   2.130
    1LIG    C5x    5   1.918   2.044   2.024
    1LIG    C6x    6   1.946   2.031   1.962
    1LIG    H1x    7   1.949   2.035   1.799
    1LIG    H2x    8   2.173   1.908   1.865
    1LIG    H3x    9   2.179   1.896   2.127
    1LIG    H4x   10   1.968   2.010   2.236
    1LIG    H5x   11   1.824   2.096   2.037
    1LIG    H6x   12   1.861   2.080   1.915
   4.00000   4.00000   4.00000

Which would be the correct energy minimised structure.

Input files
mol.p.zip

(please complete the following information):

  • OS: Linux
  • Version of Python: 3.8
  • Version of BioSimSpace: dev
  • I confirm that I have checked this bug still exists in the latest released version of BioSimSpace: No using our sandpit

Additional context
Add any other context about the problem here.

Optional RDKit run-time dependency for Sire not correctly pinned

Just cross-referencing this issue so that people are aware of the problem for fresh BioSimSpace installs. Until this is resolved, a simple workaround is to pin rdkit to version 2022 during install, e.g.:

mamba create -n biosimspace -c conda-forge -c openbiosim biosimspace rdkit=2022

Unable to generate GROMACS binary run input file

I have some AMBER input files which can be loaded without issue by SOMD and AMBER, but cannot be used with GROMACS.

When I run:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
os.environ["BSS_VERBOSE_ERRORS"] = "1"
import BioSimSpace as BSS

system = BSS.IO.readMolecules(["bound_preequil.prm7", "bound_preequil.rst7"])

protocol = BSS.Protocol.Production( runtime=1*BSS.Units.Time.nanosecond)

# SOMD - works fine
process = BSS.Process.Somd(system, protocol, platform="CUDA",  work_dir="output", extra_options={"nmoves": 500_000, "ncycles":1})
print("SOMD successful")
# Amber
process = BSS.Process.Amber(system, protocol, work_dir="output")
print("Amber successful")
# Gromacs - does not start
process = BSS.Process.Gromacs(system, protocol, work_dir="output")
print("Gromacs successful")

This gives:

SOMD successful
Amber successful
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/finlayclark/Documents/research/adaptive_sampling/full_run_test/test_gromacs_failure/test_g │
│ romacs.py:17 in <module>                                                                         │
│                                                                                                  │
│   14 process = BSS.Process.Amber(system, protocol, work_dir="output")                            │
│   15 print("Amber successful")                                                                   │
│   16 # Gromacs - does not start                                                                  │
│ ❱ 17 process = BSS.Process.Gromacs(system, protocol, work_dir="output")                          │
│   18 print("Gromacs successful")                                                                 │
│   19                                                                                             │
│                                                                                                  │
│ /home/finlayclark/anaconda3/envs/mamba/envs/openbiosim-dev/lib/python3.9/site-packages/BioSimSpa │
│ ce/Process/_gromacs.py:217 in __init__                                                           │
│                                                                                                  │
│    214 │   │   │   │   │   )                                                                     │
│    215 │   │                                                                                     │
│    216 │   │   # Now set up the working directory for the process.                               │
│ ❱  217 │   │   self._setup()                                                                     │
│    218 │                                                                                         │
│    219def _setup(self):                                                                     │
│    220 │   │   """Setup the input files and working directory ready for simulation."""           │
│                                                                                                  │
│ /home/finlayclark/anaconda3/envs/mamba/envs/openbiosim-dev/lib/python3.9/site-packages/BioSimSpa │
│ ce/Process/_gromacs.py:268 in _setup                                                             │
│                                                                                                  │
│    265 │   │   if isinstance(self._protocol, _Protocol.Custom):                                  │
│    266 │   │   │   self.setConfig(self._protocol.getConfig())                                    │
│    267 │   │   else:                                                                             │
│ ❱  268 │   │   │   self._generate_config()                                                       │
│    269 │   │   self.writeConfig(self._config_file)                                               │
│    270 │   │                                                                                     │
│    271 │   │   # Generate the dictionary of command-line arguments.                              │
│                                                                                                  │
│ /home/finlayclark/anaconda3/envs/mamba/envs/openbiosim-dev/lib/python3.9/site-packages/BioSimSpa │
│ ce/Process/_gromacs.py:376 in _generate_config                                                   │
│                                                                                                  │
│    373 │   │   gromacs_config = _GromacsConfig(                                                  │
│    374 │   │   │   self._system, self._protocol, self._property_map                              │
│    375 │   │   )                                                                                 │
│ ❱  376 │   │   self.setConfig(                                                                   │
│    377 │   │   │   gromacs_config.createConfig(                                                  │
│    378 │   │   │   │   version=_gmx_version,                                                     │
│    379 │   │   │   │   extra_options={**config_options, **self._extra_options},                  │
│                                                                                                  │
│ /home/finlayclark/anaconda3/envs/mamba/envs/openbiosim-dev/lib/python3.9/site-packages/BioSimSpa │
│ ce/Process/_gromacs.py:540 in setConfig                                                          │
│                                                                                                  │
│    537 │   │   super().setConfig(config)                                                         │
│    538 │   │                                                                                     │
│    539 │   │   # Use grompp to generate the portable binary run input file.                      │
│ ❱  540 │   │   self._generate_binary_run_file()                                                  │
│    541 │                                                                                         │
│    542def start(self):                                                                      │
│    543 │   │   """                                                                               │
│                                                                                                  │
│ /home/finlayclark/anaconda3/envs/mamba/envs/openbiosim-dev/lib/python3.9/site-packages/BioSimSpa │
│ ce/Process/_gromacs.py:491 in _generate_binary_run_file                                          │
│                                                                                                  │
│    488 │   │   │   │   │   │   + "\nUse 'ignore_warnings' to ignore warnings."                   │
│    489 │   │   │   │   │   )                                                                     │
│    490 │   │   │   │                                                                             │
│ ❱  491 │   │   │   │   raise RuntimeError(exception_string)                                      │
│    492 │   │   │                                                                                 │
│    493 │   │   │   else:                                                                         │
│    494 │   │   │   │   raise RuntimeError(                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Unable to generate GROMACS binary run input file.

'gmx grompp' reported the following errors:
ERROR 1 [file gromacs.top, line 62]:
  Expected a molecule type name and nrexcl
  ERROR 2 [file gromacs.top, line 18167]:
  Expected a molecule type name and nrexcl
  ERROR 3 [file gromacs.top, line 18167]:
  moleculetype BioSimSpace is redefined

Please see the input.

OS: Ubuntu 22
Version of Python: 3.9.15
Version of BioSimSpace: latest dev release, installed this morning

Thanks.

[BUG] readMolecules faills when passing a tuple of files

This works:

s = BSS.IO.readMolecules(["ala.top", "ala.crd"])

This doesn't

s = BSS.IO.readMolecules(("ala.top", "ala.crd"))

Both used to work, but using a tuple no longer does following the switch to using the new Sire load functionality behind the scenes.

[BUG] extra_options and extra_lines not exposed in FreeEnergy.Relative

When introducing BioSimSpace._Config (adapted from the Exscienta code) I introduced the ability to override configuration defaults by passing the extra_options and extra_lines keyword arguments to specific processes, e.g. here. However, I have forgotten to expose the functionality in BioSimSpace.FreeEnergy.Relative, e.g. so you can override defaults when setting up multi-window simulations with SOMD or GROMACS.

Slow I/O speed [BUG]

Hi,

I am providing the first "easy" testcase that demonstrates the slowness of handling big systems. This is just benzene in a 15x15x15 nm water box, or around 325k atoms. The following script takes ~76 seconds to solvate and ~57 seconds to initialise the Amber process. Interestingly, it seems that there is quite a bit of overhead in saveMolecules, and they don't even constitute most of the total runtime:

image

from functools import wraps
from time import time

import BioSimSpace.Sandpit.Exscientia as BSS


def timing(f):
    @wraps(f)
    def wrap(*args, **kw):
        ts = time()
        result = f(*args, **kw)
        te = time()
        print(f"{f} took {te - ts} seconds")
        return result

    return wrap


mol = BSS.Parameters.parameterise("c1ccccc1", "gaff2").getMolecule()
system = timing(BSS.Solvent.solvate)(
    "tip3p", molecule=mol, box=[15 * BSS.Units.Length.nanometer] * 3
)
print(f"The system has approximately {3 * len(system)} atoms")
protocol = BSS.Protocol.Minimisation()
process = timing(BSS.Process.Amber)(system, protocol)

I will try to get a protein test system as well as these should be much slower than that because of all extra terms they have, but I guess this one is a good system to start the discussion. Also note that the above example doesn't even use squashing, which of course adds extra overhead.

Many thanks.

[BUG] SireError::invalid_cast when merging a ligand with itself

Hi,

I have an issue with some ligands from the OpenFF Benchmark set (so apologies I haven't reproduced it with a faster to parametrise system). The error I get when merging them is the following:

TypeError: SireError::invalid_cast: Cannot cast from an object of class 
"SireMol::AtomPropertyProperty" to an object of class "SireMol::AtomCharges". 
(call sire.error.get_last_error_details() for more info)
import BioSimSpace as BSS

lig = BSS.IO.readMolecules("ligands.sdf")[0]
lig = BSS.Parameters.gaff2(lig).getMolecule()
merged_lig = BSS.Align.merge(lig, lig)

It seems to only happen with some ligands so it is probably something related to the input SDF file.

Many thanks.

[BUG] process.getRecords not getting the same number of values for different columns

Describe the bug
When I run energy_dict = process.getRecords(), I expect that for every key, there should be the same number of values.

To Reproduce

mol_1 = BSS.Parameters.openff_unconstrained_2_0_0("C").getMolecule()
mol_2 = BSS.Parameters.openff_unconstrained_2_0_0("CO").getMolecule()
merged = BSS.Align.merge(mol_1, mol_2)
solvated = BSS.Solvent.tip3p(merged, box=3 * [4 * BSS.Units.Length.nanometer])
protocol = BSS.Protocol.FreeEnergyMinimisation()
process = BSS.Process.Amber(solvated, protocol, exe=amber_gpu)
protocol.start()
process.wait()
new = process.getSystem()
protocol = BSS.Protocol.FreeEnergyEquilibration()
process = BSS.Process.Amber(new, protocol, exe=amber_gpu)
process.start()
process.wait()
energy_dict = process.getRecords()
for key in energy_dict:
    print(len(energy_dict[key]))

Gives

1002
1002
1503
1002
1503
1503
1503
1503
1503
1503
1002
1002
1002
1002
1002
1002
1002
501
501
501
501
501
501
501
1002
501

Expected behavior
The number of values one get from each key should be the same.

[BUG] Incorrect pinning of Sire for 2023.1.2 release

Describe the bug

The 2023.1.2 release was incorrectly pinned against Sire, so import fails due to missing functionality from sire.convert.

To Reproduce

mamba create -n openbiosim -c conda-forge -c openbiosim biosimspace
mamba activate openbiosim
ipython
import BioSimSpace as BSS

As a workaround, pulling packages from the dev label works:

mamba create -n openbiosim-dev -c conda-forge -c openbiosim/label/dev biosimspace

To Fix

I will re-pin and re-build the 2023.1.2 release when Sire 2023.1.3 is released.

Faster MCS Calculation

The current matchAtoms() uses the entire molecule to get MCS, a simple idea to accelerate it is to get MCS for heavy atoms and then match up hydrogens hanging off the MCS.
See this implementation for example:
https://github.com/OpenFreeEnergy/Lomap/blob/84e1bb0d20ceeed835166fe7218dd21f04248030/lomap/mcs.py#L1170

I didn't quantitatively test the speedup(no benchmark found) but it performed very well for large molecules in my project ( the current matchAtoms() often times out).

[BUG] AMBER Process getPressure would not recognise

Describe the bug
The first frame of my amber output says PRESS =********, which is ignored in Process.getPressure, this would mean that getPressure would return a different number of elements compared with other metrics.

To Reproduce

import BioSimSpace.Sandpit.Exscientia as BSS
mol = BSS.Parameters.openff_unconstrained_2_0_0('C').getMolecule()
process = BSS.Process.Amber(mol.toSystem(), BSS.Protocol.FreeEnergyEquilibration(), work_dir='.')
print(len(self.getPressure(True)))
print(len(self.self.getVolume(True)))

There is 26 numbers in Volume but only 25 in pressure.

Expected behavior
Both of them give 26 numbers

Input files
amber.out.zip

[BUG] prepareFEP is broken with recent versions of BSS

Describe the bug
A clear and concise description of what the bug is.
Using

Sire version: 2022.3.0 and BSS version '2022.2.1+351.gf2538d20'

The command below run on the attached input completes succesfully.
inputs.zip

python prepareFEP.py --input1 thrombin_truncated_1a_sort.prm7 thrombin_truncated_1a_sort.rst7 --input2 thrombin_truncated_1b_sort.prm7 thrombin_truncated_1b_sort.rst7 --output 1a~1b

When using
Sire version: 2023.2.2 and BSS version '2023.3.0.dev+25.g945779ae'

The above command gives

  File "/home/matthew/jm/prepareFEP/prepareFEP.py", line 17, in <module>
    from Sire.Mol import AtomIdx
ModuleNotFoundError: No module named 'Sire'

Replacing the above statement with

from sire.mol import AtomIdx

and running again gives

Cannot import sire using the mixed API as either the old or new APIs have already been activated.
(...)
AttributeError: type object 'MoleculeParser' has no attribute 'supportedFormats'

Swapping the order of the import statements from

from sire.mol import AtomIdx
import BioSimSpace as BSS

to

import BioSimSpace as BSS
from sire.mol import AtomIdx

Resolves the issue.

Not sure I understand why. If the above fix is safe then I suggest updating the version of prepareFEP at

https://github.com/OpenBioSim/biosimspace/tree/devel/nodes/playground

to be compatible with devel .

[BUG] steered MD broken for AMBER

Please change line 257 of _amber.py to:

            extra_options["plumedfile"] = "'plumed.dat'"

(i.e. add additional single quotes around 'plumed.dat').

The amber.cfg file should have:

  plumedfile='plumed.dat',

the single quotes are required. They must have been lost in changes made to the process code!

Confusion over FEP protocol

Hello,

I have been reading through the documentation for BSS.FreeEnergy and am feeling a little confused by the description of the module, specifically by the reference to PMF computation. Perhaps I have misunderstood, but my belief is that PMF is when geometric reaction coordinates are provided and a 'pulling' simulation is used to remove the ligand from the binding site. This differs from conventional FEP, where a series of alchemical states (between lambda 0 and 1) are simulated and estimators are used to extract the energy associated between transition from 0-1.

Would it be possible to clarify how the FEP protocol in BSS works, and perhaps provide some pointers to further documentation/tutorials if they are available?

Thank you

Passing kwargs to LOMAP using Align.generateNetwork()

Hi,

I'm looking to use BioSimSpace to run FEP calculations, however I would like to generate a minimal spanning graph using a radial network. This is an option in the original lomap implementation - see example here. It appears like there's no functionality to request a radial graph using BioSimSpace, would it be straightforward to pass kwargs to LOMAP in order to achieve this?

[BUG] rhombic dodecahedron triclinic spaces not working with OpenMM

The rhombic dodecahedron box parameters generated by Sire/BioSimSpace are not working with OpenMM. This is true both before and after the rounding fix introduces in OpenBioSim/sire#51. Internally OpenMM uses the same lattice reduction as us (it checks for reduced box dimensions), so I'm surprised this is not working. Other engines are fine with the parameters. Hopefully it's possible to apply a specific fix for this box type.

In [1]: import BioSimSpace as BSS

In [2]: mol = BSS.Parameters.openff_unconstrained_2_0_0("C").getMolecule()

In [3]: box, angles = BSS.Box.rhombicDodecahedronHexagon(3*BSS.Units.Length.nanometer)

In [4]: solvated = BSS.Solvent.tip3p(mol, box=box, angles=angles)

In [5]: p = BSS.Protocol.Minimisation(steps=1000)

In [6]: process = BSS.Process.OpenMM(solvated, p)

In [7]: process.start()
Out[7]: BioSimSpace.Process.OpenMM(<BioSimSpace.System: nMolecules=622>, BioSimSpace.Protocol.Minimisation(steps=1000, restraint=None, force_constant=10.0 kcal_per_mol/angstrom**2), exe='/home/lester/.conda/envs/openbiosim/bin/sire_python', name='openmm', platform='CPU', work_dir='/tmp/tmpijqr1ske', seed=None)

In [8]: process.wait()

In [9]: process.isError()
Out[9]: True

In [10]: process.stderr()
/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
Traceback (most recent call last):
  File "/tmp/tmpijqr1ske/openmm_script.py", line 34, in <module>
    simulation = Simulation(prmtop.topology,
  File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/app/simulation.py", line 103, in __init__
    self.context = mm.Context(self.system, self.integrator, platform, platformProperties)
  File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/openmm.py", line 10069, in __init__
    _openmm.Context_swiginit(self, _openmm.new_Context(*args))
openmm.OpenMMException: NonbondedForce: The cutoff distance cannot be greater than half the periodic box size.  For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#boxsize

In [11]: box, angles = BSS.Box.rhombicDodecahedronSquare(3*BSS.Units.Length.nanometer)

In [12]: solvated = BSS.Solvent.tip3p(mol, box=box, angles=angles)

In [13]: process = BSS.Process.OpenMM(solvated, p)

In [14]: process.start()
Out[14]: BioSimSpace.Process.OpenMM(<BioSimSpace.System: nMolecules=632>, BioSimSpace.Protocol.Minimisation(steps=1000, restraint=None, force_constant=10.0 kcal_per_mol/angstrom**2), exe='/home/lester/.conda/envs/openbiosim/bin/sire_python', name='openmm', platform='CPU', work_dir='/tmp/tmpmd9xk4t6', seed=None)

In [15]: process.wait()

In [16]: process.isError()
Out[16]: True

In [17]: process.stderr()
Traceback (most recent call last):
  File "/tmp/tmpmd9xk4t6/openmm_script.py", line 16, in <module>
    prmtop = AmberPrmtopFile('openmm.prm7')
  File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/app/amberprmtopfile.py", line 156, in __init__
    top.setPeriodicBoxVectors(computePeriodicBoxVectors(*(box[1:4] + box[0:1]*3)))
  File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/app/internal/unitcell.py", line 60, in computePeriodicBoxVectors
    cz = math.sqrt(c_length*c_length-cx*cx-cy*cy)
ValueError: math domain error

In both cases the box dimensions and angles are stable, and are the same before/after the fix, i.e.:

In [1]: import BioSimSpace as BSS

In [2]: box, angles = BSS.Box.rhombicDodecahedronHexagon(3*BSS.Units.Length.nanometer)

In [3]: mol = BSS.Parameters.openff_unconstrained_2_0_0("C").getMolecule()

In [4]: solvated = BSS.Solvent.tip3p(mol, box=box, angles=angles)

In [5]: solvated.getBox()
Out[5]:
([30.0000 A, 25.9856 A, 30.0000 A],
 [73.7993 degrees, 120.0000 degrees, 88.8975 degrees])

In [6]: box, angles = BSS.Box.rhombicDodecahedronSquare(3*BSS.Units.Length.nanometer)

In [7]: solvated = BSS.Solvent.tip3p(mol, box=box, angles=angles)

In [8]: solvated.getBox()
Out[8]:
([30.0000 A, 30.0000 A, 30.0000 A],
 [120.0000 degrees, 120.0000 degrees, 90.0000 degrees])

Matching atoms and merging ring systems (fused rings, different hybridisation states) - issues

Hello! I'm currently matching and merging molecules for MCL1 ligands. The basic code used to replicate these is:

# ligands
lig0 = "lig_27"
lig1 = "lig_42"
# options
prematch = {}
ring_size = False
ring_break = False
complete_ring = True

lig_0 = BSS.IO.readMolecules([f"{lig0}.rst7", f"{lig0}.prm7"])
lig_1 = BSS.IO.readMolecules([f"{lig1}.rst7", f"{lig1}prm7"])

mapping = BSS.Align.matchAtoms(
                            lig_0, lig_1,
                            complete_rings_only=complete_ring,
                            prematch=prematch,
                            )    

inv_mapping = {v: k for k, v in mapping.items()}
lig_1_a = BSS.Align.rmsdAlign(lig_1, lig_0, inv_mapping)

merged_ligands = BSS.Align.merge(lig_0, lig_1_a, mapping,
                                allow_ring_breaking=ring_break,
                                allow_ring_size_change=complete_ring
                                  )

With the files named as the ligands attached.
ligands.zip

There are some different situations that occur in relation to ring systems, outlined below.

Case 1

In the most basic case, as in the case of some ligands (lig_27~lig_42), this does not proceed correctly initially. Many of the atoms are included in the perturbable region and some of the shared atoms are not included in the MCS.
image

When the mapping dictionary {21:6} is added as prematch, this is then able to proceed okay. Prematch for this ligand series is chosen so the N in the core region match.
image
Notably, this is able to proceed with ring_size=False and ring_break=False and the ring is mapped correctly to the fused ring system, which is not the case in Case 2.

Case 2

However in a another case, lig_27~lig_43, when the mapping dictionary{ 21:26 } which is also the N is added, this is not able to proceed without setting ring_size=True, ring_break=True . In this case then the resulting perturbed region is like below:
image
I think ideally, with the introduced mapping, this should still be able to proceed with ring_size and ring_break set as False, and the whole ring region in this case is taken as the perturbed region in each case. So for lig_27 the phenyl ring and for lig_43 the entire fused ring system. Then if these were both set to True, in that case the phenyl ring should ideally be mapped onto an entire ring in the fused phenyl as opposed to split between the two fused rings.
The difference between lig_42 and lig_43 in this case is minor, with lig_42 being a 1-napthyl group and lig_43 being a 2-napthyl group, but this causes a large difference in the mapping.

Case 3

Here, lig_43~lig_45, there is a change of hybridisation state with the rings. Again, without prematch, the core region of the molecule is ignored:
image

And when prematch {26:6} is introduced, with ring_size=False, ring_break=False it then maps the differently hybridised C to each other. The differently hybridised atoms should not map to each other as this is not great for the free energy perturbation.
image

I was wondering if there would be some way to programmatically address the mapping and merging ring perturbations so the intended perturbable molecule can be obtained? I guess the atoms could also be removed manually from the generated mapping that is passed into the merge function, however this requires quite some time per perturbation to do manually as each atom needs to be inspected then.

Thank you!

[BUG] Adding a cofactor

Hi,

I have tried to pull all togheter a protein a ligand a and B and a cofactor. I managed to run the parametrization of the four components. Also before to merge all togheter I create the mapping between the two ligands (A and B), then I merged then after that using the next line complx = merged + RNMT + cofactor.

The I tried to minimize the system but it fails to do that. The error is: RuntimeError: 'gmx genion' failed to add ions! Perhaps your box is too small?. I increase the size of the box, but it fails no matter the size. When I merge only protein and merged ligands, it works well.

Thanks a lot

Cesar

[BUG] Check for non-periodic Cartesian space in process setup

I'm not sure this wasn't caught by the local CI for the release, but we need to add a few more checks for a non-periodic space when setting up a process. (Sire now adds a default Cartesian space if no property is present.) This was fixed for NAMD here, but needs to be applied for all engines.

[BUG] BioSimSpace.Align.flexAlign is broken

Describe the bug
Calling BioSimSpace.Align.matchAtoms with the scoring function ''rmsd_flex_align '' causes a runtime error.
The option "rmsd_align" completes without runtime error on the provide inputs.

To Reproduce
Steps to reproduce the behavior:

import BioSimSpace as BSS
lig0 = BSS.IO.readMolecules(["lig_27_lig.prm7","lig_27_lig.rst7"])[0]
lig1 = BSS.IO.readMolecules(["lig_59_lig.prm7","lig_59_lig.rst7"])[0]
mapping = BSS.Align.matchAtoms(lig0, lig1,scoring_function="rmsd_align")  
len(mapping)
mapping = BSS.Align.matchAtoms(lig0, lig1,scoring_function="rmsd_flex_align")   

image

Expected behavior
The above code should produce a mapping between the two supplied inputs. It should not raise
a NameError exception.

Input files
attached
flexalign.zip

(please complete the following information):

  • OS: [e.g. Linux (distribution), MacOS (version), Windows (version)] Tested on JupyterHub server at try.openbiosim.org
  • Version of Python: [e.g. 3.8, 3.9 etc.] 3.9.13
  • Version of BioSimSpace: [e.g. 2023.1, or latest dev release] 2023.2..0
  • I confirm that I have checked this bug still exists in the latest released version of BioSimSpace: [yes]

[BUG] Protocol mixin inheritance loses base class attributes

The current order of inheritance means that the BioSimSpace.Protocol base class attributes are lost in derived classes. While this doesn't affect running via scripts, or any existing tests, printing a protocol within an interactive session will raise an exception due to the missing attribute.

The solution is to re-order the inheritance, so that the mixins are standalone classes and don't inherit from the base class.

[BUG] Angles will need to be updated in tests/_SireWrappers/test_system.py::test_set_box

Just noting here so that I don't forget...

When the triclinic space fix is merged into Sire we will need to update the test angles in test_set_box. At the moment they are:

expected_angles = [109.4712, 70.5288, 70.5288] * BSS.Units.Angle.degree

This needs to be changed to:

expected_angles = [70.5288, 109.4712, 70.5288] * BSS.Units.Angle.degree

Note that the new angles are in agreement with what GROMACS expects for a default truncated octahedron, as described here. (Note that there are some typo in this page so the exact angles are wrong.)

With the update to Sire we would get:

In [1]: import BioSimSpace as BSS

In [2]: box, angles = BSS.Box.truncatedOctahedron(30*BSS.Units.Length.angstrom)

In [3]: angles
Out[3]: [70.5288 degrees, 109.4712 degrees, 70.5288 degrees]

[BUG] Unable to save to RST7 file

Describe the bug
I have a Gromacs topology and coordinate file and I tried to save it as RST7 file

To Reproduce

In [1]: import BioSimSpace.Sandpit.Exscientia as BSS

In [7]: mol = BSS.IO.readMolecules(['test.top', 'test.gro'])

In [8]: BSS.IO.saveMolecules('test', mol, 'rst7')
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
│                                                                                                  │
│ /exs/shared/collaboration/teams/mdteam/envs/asfe/lib/python3.8/site-packages/BioSimSpace/Sandpit │
│ /Exscientia/IO/_io.py:757 in saveMolecules                                                       │
│                                                                                                  │
│    754 │   │   │   if _isVerbose():                                                              │
│    755 │   │   │   │   raise IOError(msg) from e                                                 │
│    756 │   │   │   else:                                                                         │
│ ❱  757 │   │   │   │   raise IOError(msg) from None                                              │
│    758 │                                                                                         │
│    759 │   # Return the list of files.                                                           │
│    760 │   return files                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: Failed to save system to format: 'rst7'

Expected behavior
It writes

Input files
Archive.zip

(please complete the following information):

  • OS: Linux
  • Version of Python: 3.8
  • Version of BioSimSpace: [e.g. 2023.1, or latest dev release]
  • sire 2023.3.0 py38h3fd9d12_0 openbiosim/label/main
  • biosimspace 2023.3.0 py38_0 openbiosim/label/main
  • I confirm that I have checked this bug still exists in the latest released version of BioSimSpace: yes

Details of BSS parallelisation options / schemes

Hi,

I'm looking at running BSS.FreeEnergy.Relative calculations. I would like to have a better understanding of how this process uses multiple cores and GPus, running either via SOMD or GROMACS engine.

For example, if I wish to calculate RBFE differences between a set of 10 mols, I would like to know whether I would see large benefits in optimising the process beyond running mol_a, mol_b edges in serial using .run() methods. Or, does BSS automatically scale single simulations to use the maximum compute, and thus trying to run edges in parallel would be pointless?

There are multiple compute scenarios in which I wish to run these calculations.

  1. Using my workstation which is an 8 core system with a consumer RTX GPU.
  2. Using a larger compute node/VM with 64 vCPUs and 4 GPUs.
  3. In the future, I may perform these simulations using a cloud system where I can spawn VMs when necessary.

I have looked at the documentation but haven't been able to find much information.

Thanks

[BUG] Sire file caching breaking process.getSystem() when used interactively

I think that this might be fixed with the latest Sire trajectory code, but for certain binary formats, e.g. AMBER rst files, the Sire file cache means that a cached file will be re-used on read if the file name matches. This breaks the process.getSystem() functionality for BioSimSpace when used interactively, since you will keep getting back the same set of coordinates that were returned by the initial call to the method.

As a simple example:

In [1]: import BioSimSpace as BSS

In [2]: s0 = BSS.IO.readMolecules(["test0.rst", "test.top"])

In [3]: s1 = BSS.IO.readMolecules(["test1.rst", "test.top"])

In [4]: cp test1.rst test0.rst

In [5]: s2 = BSS.IO.readMolecules(["test0.rst", "test.top"])

In [6]: s0[0].coordinates()
Out[6]:
[(13.6813 A, 13.1482 A, 15.2733 A),
 (13.6813 A, 14.2382 A, 15.2733 A),
 (13.1676 A, 14.6020 A, 16.1632 A),
 (13.1676 A, 14.6020 A, 14.3835 A),
 (15.1088 A, 14.7890 A, 15.2733 A),
 (16.0719 A, 14.0256 A, 15.2733 A),
 (15.2367 A, 16.1178 A, 15.2733 A),
 (14.4145 A, 16.7043 A, 15.2733 A),
 (16.5346 A, 16.7621 A, 15.2733 A),
 (17.0889 A, 16.4637 A, 16.1632 A),
 (17.3426 A, 16.3690 A, 14.0412 A),
 (16.8046 A, 16.6695 A, 13.1421 A),
 (18.3118 A, 16.8671 A, 14.0676 A),
 (17.4899 A, 15.2890 A, 14.0320 A),
 (16.3940 A, 18.2776 A, 15.2733 A),
 (15.2820 A, 18.8009 A, 15.2734 A),
 (17.5274 A, 18.9831 A, 15.2734 A),
 (18.4183 A, 18.5073 A, 15.2733 A),
 (17.5274 A, 20.4321 A, 15.2734 A),
 (16.4999 A, 20.7959 A, 15.2734 A),
 (18.0411 A, 20.7959 A, 16.1632 A),
 (18.0411 A, 20.7959 A, 14.3835 A)]

In [7]: s1[0].coordinates()
Out[7]:
[(13.5716 A, 13.2645 A, 15.6562 A),
 (13.5789 A, 14.3357 A, 15.4607 A),
 (12.9188 A, 14.8503 A, 16.1556 A),
 (13.2508 A, 14.5096 A, 14.4374 A),
 (14.9839 A, 14.8623 A, 15.6344 A),
 (15.8174 A, 14.1710 A, 16.2109 A),
 (15.2255 A, 16.0742 A, 15.1400 A),
 (14.4553 A, 16.5900 A, 14.7412 A),
 (16.5121 A, 16.7744 A, 15.1675 A),
 (17.0302 A, 16.5318 A, 16.0974 A),
 (17.3649 A, 16.2773 A, 13.9880 A),
 (16.8675 A, 16.5102 A, 13.0454 A),
 (18.3440 A, 16.7586 A, 14.0058 A),
 (17.5069 A, 15.1985 A, 14.0639 A),
 (16.3136 A, 18.3070 A, 15.1113 A),
 (15.2321 A, 18.7896 A, 14.7589 A),
 (17.3760 A, 19.0610 A, 15.4345 A),
 (18.2254 A, 18.5853 A, 15.7018 A),
 (17.4120 A, 20.5239 A, 15.4079 A),
 (16.4059 A, 20.9376 A, 15.5047 A),
 (18.0217 A, 20.8950 A, 16.2344 A),
 (17.8478 A, 20.8669 A, 14.4677 A)]

In [8]: s2[0].coordinates()
Out[8]:
[(13.6813 A, 13.1482 A, 15.2733 A),
 (13.6813 A, 14.2382 A, 15.2733 A),
 (13.1676 A, 14.6020 A, 16.1632 A),
 (13.1676 A, 14.6020 A, 14.3835 A),
 (15.1088 A, 14.7890 A, 15.2733 A),
 (16.0719 A, 14.0256 A, 15.2733 A),
 (15.2367 A, 16.1178 A, 15.2733 A),
 (14.4145 A, 16.7043 A, 15.2733 A),
 (16.5346 A, 16.7621 A, 15.2733 A),
 (17.0889 A, 16.4637 A, 16.1632 A),
 (17.3426 A, 16.3690 A, 14.0412 A),
 (16.8046 A, 16.6695 A, 13.1421 A),
 (18.3118 A, 16.8671 A, 14.0676 A),
 (17.4899 A, 15.2890 A, 14.0320 A),
 (16.3940 A, 18.2776 A, 15.2733 A),
 (15.2820 A, 18.8009 A, 15.2734 A),
 (17.5274 A, 18.9831 A, 15.2734 A),
 (18.4183 A, 18.5073 A, 15.2733 A),
 (17.5274 A, 20.4321 A, 15.2734 A),
 (16.4999 A, 20.7959 A, 15.2734 A),
 (18.0411 A, 20.7959 A, 16.1632 A),
 (18.0411 A, 20.7959 A, 14.3835 A)]

As you can see, the coordinates for s0 and s1 differ, but s2 has the same coordinates as s0 when it should match s1. Files to reproduce are here. I'm not sure what other formats this is also an issue for. I've tested some others, e.g. rst7, and it works as expected.

@chryswoods: Do you know if this is no-longer an issue? I thought the cache was cleared when all files were parsed, but maybe this isn't true for certain formats. The above caused me a bit of a headache debugging something since the coordinates from getSystem() mismatched those from last frame of the the trajectory file obtained with getTrajectory(). Maybe there's a simple way of clearing the cache, or disabling it when I know a user is working interactively.

Issue reading in PDB/top file

Hi all,

I've been having a strange issue reading in a PDB/top files using BSS, however it's not entirely clear to me that this is an issue with BSS or my setup. The reason I say this is that I've had no issue with these files on one of my work stations, however have been encountering the issue on a cloud server. What's strange is both the work station and the server use the same version of BSS (2023.1.2) and Sire (2023.1.4). I have also tried using the latest version of BSS on the web server and have encountered the same issue.

Simply, I'm trying to read in the molecule using
BSS.IO.readMolecules(['protein_pdb2gmx.pdb', 'topol.top'])

On my work station this isn't a problem and returns
<BioSimSpace.System: nMolecules=1>

However on the server I receive the following error.

OSError: SireError::io_error: Cannot load the molecules: There is not enough information in this parser (PDB2( 
nMolecules() = 1, nChains() = 0, nResidues() = 288, nAtoms() = 4644 )) to start the creation of a new System. You 
need to use a more detailed input file. (call sire.error.get_last_error_details() for more info)
...
OSError: Failed to read molecules from: ['protein_pdb2gmx.pdb', 'topol.top']

If I try to read in just the PDB file then this error no longer exists.

The files are located here

I'm a little stuck as to what to try next, any suggestions would be welcome.

[BUG] AMBER output parser gives wrong number of values in dof=2

Describe the bug
Problem with parsing the amber output file. For the state where there is softcore region in both TI region 1 and TI region 2. The current parser gives wrong number of temperature in dof=2

To Reproduce
The example output would be

| TI region  1


 NSTEP =      600   TIME(PS) =       1.200  TEMP(K) =   299.61  PRESS =     0.0
 Etot   =    -17683.1326  EKtot   =      3854.5497  EPtot      =    -21537.6823
 BOND   =         7.4462  ANGLE   =       101.3144  DIHED      =         3.5441
 1-4 NB =         7.4910  1-4 EEL =       -33.9679  VDWAALS    =      3115.5266
 EELEC  =    -24743.2707  EHBOND  =         0.0000  RESTRAINT  =         4.2340
 EAMBER (non-restraint)  =    -21541.9164
 EKCMT  =         0.0000  VIRIAL  =         0.0000  VOLUME     =     65483.6402
                                                    Density    =         0.9930
 TEMP0  =       298.0000  REPNUM  =              1  EXCHANGE#  =              1
 ------------------------------------------------------------------------------

  Softcore part of the system:       3 atoms,       TEMP(K)    =           0.00
 SC_Etot=         0.6102  SC_EKtot=         0.0000  SC_EPtot   =         0.6102
 SC_BOND=         0.0000  SC_ANGLE=         0.6090  SC_DIHED   =         0.0012
 SC_14NB=         0.0000  SC_14EEL=         0.0000  SC_VDW     =         0.0000
 SC_EEL =         0.0000
 SC_RES_DIST=     0.0000  SC_RES_ANG=       0.0000  SC_RES_TORS=         0.0000
 SC_EEL_DER=      0.0000  SC_VDW_DER=       0.0000  SC_DERIV   =         0.0000
 ------------------------------------------------------------------------------


| TI region  2


 NSTEP =      600   TIME(PS) =       1.200  TEMP(K) =   299.61  PRESS =     0.0
 Etot   =    -17683.1326  EKtot   =      3854.5497  EPtot      =    -21537.6823
 BOND   =         7.4462  ANGLE   =       101.3144  DIHED      =         3.5441
 1-4 NB =         7.4910  1-4 EEL =       -33.9679  VDWAALS    =      3115.5266
 EELEC  =    -24743.2707  EHBOND  =         0.0000  RESTRAINT  =         4.2340
 EAMBER (non-restraint)  =    -21541.9164
 EKCMT  =         0.0000  VIRIAL  =         0.0000  VOLUME     =     65483.6402
                                                    Density    =         0.9930
 TEMP0  =       298.0000  REPNUM  =              1  EXCHANGE#  =              1
 ------------------------------------------------------------------------------

  Softcore part of the system:       3 atoms,       TEMP(K)    =           0.00
 SC_Etot=         0.6102  SC_EKtot=         0.0000  SC_EPtot   =         0.6102
 SC_BOND=         0.0000  SC_ANGLE=         0.6090  SC_DIHED   =         0.0012
 SC_14NB=         0.0000  SC_14EEL=         0.0000  SC_VDW     =         0.0000
 SC_EEL =         0.0000
 SC_RES_DIST=     0.0000  SC_RES_ANG=       0.0000  SC_RES_TORS=         0.0000
 SC_EEL_DER=      0.0000  SC_VDW_DER=       0.0000  SC_DERIV   =         0.0000
 ------------------------------------------------------------------------------

If you replace https://github.com/OpenBioSim/biosimspace/blob/devel/tests/Sandpit/Exscientia/output/amber_fep.out with
amber_fep.out.zip. The test will fail.

Expected behavior
One might need two dof for the softcore region
(please complete the following information):

  • OS: [e.g. Linux (distribution), MacOS (version), Windows (version)]
  • Version of Python: [e.g. 3.8, 3.9 etc.]
  • Version of BioSimSpace: [e.g. 2023.1, or latest dev release]
  • I confirm that I have checked this bug still exists in the latest released version of BioSimSpace: [yes/no]

Additional context
Add any other context about the problem here.

[BUG] process.getDensity broken

Describe the bug
The process.getDensity method is broken

To Reproduce

import BioSimSpace as BSS
mol = BSS.Parameters.openff_unconstrained_2_0_0("CO").getMolecule()
solvated = BSS.Solvent.tip3p(mol, box=3 * [4 * BSS.Units.Length.nanometer])
protocol = BSS.Protocol.Equilibration()
process = BSS.Process.Amber(solvated, protocol, exe=amber_gpu)
process.getDensity()

gives

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
│                                                                                                  │
│ /sharedfs-home/zwu/src/biosimspace/python/BioSimSpace/Sandpit/Exscientia/Process/_amber.py:1669  │
│ in getDensity                                                                                    │
│                                                                                                  │
│   1666 │   │   density : float                                                                   │
│   1667 │   │      The density.                                                                   │
│   1668 │   │   """                                                                               │
│ ❱ 1669 │   │   return self.getRecord("DENSITY", time_series, block)                              │
│   1670 │                                                                                         │
│   1671 │   def getCurrentDensity(self, time_series=False):                                       │
│   1672 │   │   """                                                                               │
│                                                                                                  │
│ /sharedfs-home/zwu/src/biosimspace/python/BioSimSpace/Sandpit/Exscientia/Process/_amber.py:809   │
│ in getRecord                                                                                     │
│                                                                                                  │
│    806 │   │   if self.isError():                                                                │
│    807 │   │   │   _warnings.warn("The process exited with an error!")                           │
│    808 │   │                                                                                     │
│ ❱  809 │   │   return self._get_stdout_record(record.strip().upper(), time_series, unit)         │
│    810 │                                                                                         │
│    811 │   def getCurrentRecord(self, record, time_series=False, unit=None):                     │
│    812 │   │   """                                                                               │
│                                                                                                  │
│ /sharedfs-home/zwu/src/biosimspace/python/BioSimSpace/Sandpit/Exscientia/Process/_amber.py:1829  │
│ in _get_stdout_record                                                                            │
│                                                                                                  │
│   1826 │   │   if unit is not None:                                                              │
│   1827 │   │   │   if not isinstance(unit, _Type):                                               │
│   1828 │   │   │   │   print(unit)                                                               │
│ ❱ 1829 │   │   │   │   raise TypeError("'unit' must be of type 'BioSimSpace.Types'")             │
│   1830 │   │                                                                                     │
│   1831 │   │   # Return the list of dictionary values.                                           │
│   1832 │   │   if time_series:                                                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: 'unit' must be of type 'BioSimSpace.Types'

Expected behavior
Runs

Should use keyword argument in https://github.com/OpenBioSim/biosimspace/blob/devel/python/BioSimSpace/Process/_amber.py#L596 instead of position argument.

[BUG] AMBER time series records are non-deterministic

Describe the bug
So when I run a simulation with BSS.Protocol.Production(report_interval=1), I think I shall get at least runtime/timestep/report_interval number of records in with process.getRecords().

To Reproduce

import BioSimSpace.Sandpit.Exscientia as BSS
mol = BSS.Parameters.openff_unconstrained_2_0_0('C').getMolecule()
protocol = BSS.Protocol.Production(report_interval=1)
process = BSS.Process.Amber(mol.toSystem(), protocol, exe=amber_exe)
process.start()
process.wait()
process.getRecords()

Gives {}
Expected behavior
I shall get a dictionary in the format of

{'NSTEP': ['0', '278800'],
 'TIME(PS)': ['0.000', '557.600'],
 'TEMP(K)': ['452.66', '293.49'],
 'PRESS': ['0.0', '0.0'],
 'ETOT': ['-14835.3346', '-14835.3346'],
 'EKTOT': ['5845.1152', '5845.1152'],
 'EPTOT': ['-20680.4498', '-20680.4498'],
 'BOND': ['3.1693', '3.1693'],
 'ANGLE': ['18.6614', '18.6614'],
 'DIHED': ['3.0760', '3.0760'],
 '1-4 NB': ['4.2537', '4.2537'],
 '1-4 EEL': ['-0.1536', '-0.1536'],
 'VDWAALS': ['3151.0853', '3151.0853'],
 'EELEC': ['-23860.5419', '-23860.5419'],
 'EHBOND': ['0.0000', '0.0000'],
 'RESTRAINT': ['0.0000', '0.0000'],
 'EKCMT': ['0.0000', '0.0000'],
 'VIRIAL': ['0.0000', '0.0000'],
 'VOLUME': ['64000.0000', '64000.0000'],
 'DENSITY': ['1.0122', '1.0122']}

Where there is runtime/timestep/report_interval number of records for each entry.

(please complete the following information):

  • OS: Linux
  • Version of Python: 3.8
  • Version of BioSimSpace: dev
  • I confirm that I have checked this bug still exists in the latest released version of BioSimSpace: no

Additional context
Add any other context about the problem here.

[BUG] Coordinates from GROMACS vacuum simulations can cause overflow as input to AMBER

For GROMACS we implement vacuum simulations by using pseudo-PBC conditions, i.e. running calculations in a near infinite box, as described here. (This approach works with all versions of GROMACS that we support.) However, a result of this is that absolute molecular coordinates can end up being really large, so can't be used as input to other engines, e.g. AMBER, since they overflow the formatting restrictions of the input file. For example:

In [1]: import BioSimSpace as BSS

INFO:numexpr.utils:Note: NumExpr detected 20 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
INFO:numexpr.utils:NumExpr defaulting to 8 threads.
WARNING:root:Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.

In [2]: BSS.setVerbose(True)

In [3]: m = BSS.Parameters.openff_unconstrained_2_0_0("CC").getMolecule()

In [4]: p = BSS.Protocol.Minimisation()

In [5]: process = BSS.Process.Gromacs(m.toSystem(), p)
/home/lester/Code/openbiosim/biosimspace/python/BioSimSpace/Process/_gromacs.py:294: UserWarning: No simulation box found. Assuming gas phase simulation.
  _warnings.warn("No simulation box found. Assuming gas phase simulation.")
/home/lester/Code/openbiosim/biosimspace/python/BioSimSpace/_Config/_config.py:96: UserWarning: No simulation box found. Assuming gas phase simulation.
  _warnings.warn("No simulation box found. Assuming gas phase simulation.")

In [6]: process.start()
Out[6]: BioSimSpace.Process.Gromacs(<BioSimSpace.System: nMolecules=1>, BioSimSpace.Protocol.Minimisation(steps=10000, restraint=None, force_constant=10.0 kcal_per_mol/angstrom**2), exe='/home/lester/.conda/envs/openbiosim/bin.AVX2_256/gmx', name='gromacs', work_dir='/tmp/tmpyuhwqevj', seed=None)

In [7]: process.wait()

In [8]: s = process.getSystem()

In [9]: s[0].coordinates()
Out[9]:
[(9998.2400 A, 0.0200 A, 9999.0200 A),
 (9999.7600 A, -0.0200 A, 9998.9800 A),
 (9997.8600 A, -0.8300 A, 9999.6000 A),
 (9997.8900 A, 0.9500 A, 9999.4900 A),
 (9997.8300 A, -0.0400 A, 9998.0100 A),
 (1.0000e+04 A, -0.9200 A, 9998.4600 A),
 (1.0000e+04 A, 0.8600 A, 9998.4500 A),
 (1.0000e+04 A, -0.0300 A, 9999.9900 A)]

In [10]: BSS.IO.saveMolecules("test", s, "rst7")
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/lester/Code/openbiosim/biosimspace/python/BioSimSpace/IO/_io.py:730 in saveMolecules       │
│                                                                                                  │
│    727 │   │   │   │   _os.rename(file, new_file)                                                │
│    728 │   │   │   │   file = [new_file]                                                         │
│    729 │   │   │   else:                                                                         │
│ ❱  730 │   │   │   │   file = _SireIO.MoleculeParser.save(                                       │
│    731 │   │   │   │   │   system._sire_object, filebase, _property_map                          │
│    732 │   │   │   │   )                                                                         │
│    733                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: SireError::io_error: Cannot write the (perhaps some of the ) files as the following errors occurred:
Failed to write the file '/home/lester/Downloads/vac_issue/test.rst7' using the parser for fileformat 'RST7'. Errors reported by the parser
are:
Errors converting the system to a Amber Rst7 format...
Could not write the float at index 15, value '10000.1' into the specified format AmberFormat( 6 x float[width = 12, precision = 7] ).
Could not write the float at index 18, value '10000.1' into the specified format AmberFormat( 6 x float[width = 12, precision = 7] ).
Could not write the float at index 21, value '10000.2' into the specified format AmberFormat( 6 x float[width = 12, precision = 7] ). (call
sire.error.get_last_error_details() for more info)

The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
│                                                                                                  │
│ /home/lester/Code/openbiosim/biosimspace/python/BioSimSpace/IO/_io.py:742 in saveMolecules       │
│                                                                                                  │
│    739 │   │   except Exception as e:                                                            │
│    740 │   │   │   msg = "Failed to save system to format: '%s'" % format                        │
│    741 │   │   │   if _isVerbose():                                                              │
│ ❱  742 │   │   │   │   raise IOError(msg) from e                                                 │
│    743 │   │   │   else:                                                                         │
│    744 │   │   │   │   raise IOError(msg) from None                                              │
│    745                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: Failed to save system to format: 'RST7'

I'm confused why this is happening since the molecule should be placed at the centre of the box, but perhaps this isn't working correctly. For a minimisation it should be easy enough to translate it back to some reasonable place afterwards, but maybe not for other protocols where a user might want a trajectory.

[BUG] Cannot read the topology

Describe the bug
Cannot read the topology

To Reproduce

import BioSimSpace as BSS
BSS.IO.readMolecules(('Mcl1_vacuum.parm7', 'Mcl1_vacuum.rst7'))

Got

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/ubuntu/miniconda3/envs/BSSOrion/lib/python3.8/site-packages/BioSimSpace/IO/_io.py:453 in   │
│ readMolecules                                                                                    │
│                                                                                                  │
│    450 │                                                                                         │
│    451 │   # Try to read the files and return a molecular system.                                │
│    452 │   try:                                                                                  │
│ ❱  453 │   │   system = _patch_sire_load(                                                        │
│    454 │   │   │   files,                                                                        │
│    455 │   │   │   directory=str(download_dir),                                                  │
│    456 │   │   │   property_map=property_map,                                                    │
│                                                                                                  │
│ /home/ubuntu/miniconda3/envs/BSSOrion/lib/python3.8/site-packages/BioSimSpace/IO/_io.py:1078 in  │
│ _patch_sire_load                                                                                 │
│                                                                                                  │
│   1075 │                                                                                         │
│   1076 │   for i in range(0, len(paths)):                                                        │
│   1077 │   │   # resolve the paths, downloading as needed                                        │
│ ❱ 1078 │   │   p += _sire._load._resolve_path(paths[i], directory=directory, silent=silent)      │
│   1079 │                                                                                         │
│   1080 │   paths = p                                                                             │
│   1081                                                                                           │
│                                                                                                  │
│ /home/ubuntu/miniconda3/envs/BSSOrion/lib/python3.8/site-packages/sire/_load.py:116 in           │
│ _resolve_path                                                                                    │
│                                                                                                  │
│   113 │   if hasattr(directory, "strpath"):                                                      │
│   114 │   │   directory = directory.strpath                                                      │
│   115 │                                                                                          │
│ ❱ 116 │   if os.path.exists(path) and os.path.isfile(path):                                      │
│   117 │   │   if path.endswith(".gz"):                                                           │
│   118 │   │   │   # unzip the file first                                                         │
│   119 │   │   │   unzipped = path[0:-3]                                                          │
│                                                                                                  │
│ /home/ubuntu/miniconda3/envs/BSSOrion/lib/python3.8/genericpath.py:19 in exists                  │
│                                                                                                  │
│    16 def exists(path):                                                                          │
│    17 │   """Test whether a path exists.  Returns False for broken symbolic links"""             │
│    18 │   try:                                                                                   │
│ ❱  19 │   │   os.stat(path)                                                                      │
│    20 │   except (OSError, ValueError):                                                          │
│    21 │   │   return False                                                                       │
│    22 │   return True                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: stat: path should be string, bytes, os.PathLike or integer, not tuple

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
│                                                                                                  │
│ /home/ubuntu/miniconda3/envs/BSSOrion/lib/python3.8/site-packages/BioSimSpace/IO/_io.py:504 in   │
│ readMolecules                                                                                    │
│                                                                                                  │
│    501 │   │   │   │   else:                                                                     │
│    502 │   │   │   │   │   raise IOError(msg) from None                                          │
│    503 │   │   │   else:                                                                         │
│ ❱  504 │   │   │   │   msg = "Failed to read molecules from: %s" % files                         │
│    505 │   │   │   │   if _isVerbose():                                                          │
│    506 │   │   │   │   │   raise IOError(msg) from e0                                            │
│    507 │   │   │   │   else:                                                                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: not all arguments converted during string formatting

Expected behavior
Loads

Input files

Archive.zip

(please complete the following information):

  • OS: Linux
  • Version of Python: 3.8
  • Version of BioSimSpace: main
  • I confirm that I have checked this bug still exists in the latest released version of BioSimSpace: [yes/no]

Additional context
Add any other context about the problem here.

[BUG] Unable to save BSS mol as PDB

Describe the bug

To Reproduce
Something like this to create PDB/TOP files:

a_sdf = 'ejm31.sdf'
mol = BSS.IO.readMolecules(a_sdf)[0]
mol = BSS.Parameters.gaff2(mol, charge_method="BCC").getMolecule()
out = BSS.IO.saveMolecules('/mnt/block-volume/notebooks/hybrid_gen_tests/', mol, ["PDB", "GroTop"])

OSError: Failed to save system to format: 'PDB'

Input files
https://drive.google.com/file/d/1TQTtNXS081vviCqa8cwLjy3sX57UARSh/view?usp=sharing

(please complete the following information):

  • OS: Linus 22.04
  • Version of Python: 3.9
  • Version of BioSimSpace: '2023.2.2'
  • Yes i think this is the latest version

Additional context
I've also been having issues reading in previously generated GRO and TOP files.

[BUG] GROMACS 2023.1 no longer uses maxwarn -1 for unlimited warnings

Previously, gmx grompp would except maxwarn -1 to allow unlimited warnings. For version 2023.1, this isn't the case and it will return an error, i.e.:

Inconsistency in user input:
Max number of warnings need to be a positive integer

In order to fix this in a backwards compatible way, I propose to use a large number, e.g. maxwarn 9999, which is what we used to do. Annoyingly this change wasn't documented anywhere in the GROMACS CHANGELOG, so wasn't spotted until a version of Sire was built against a 2023 GROMACS package.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.