ur-whitelab / hoomd-tf

A plugin that allows the use of TensorFlow in HOOMD-blue for GPU-accelerated ML+MD.

Home Page: https://hoomd-tf.readthedocs.io
License: MIT License
The `eds_bias` method in utils.py has little documentation. Add a docstring and leading comments in the same style as the rest of the methods.
hoomd requires local variables to be prefixed with `m_` instead of `_` for certain macros to function. We should change over all the local variables to match this syntax.
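A crude text-level sketch of how such a rename could be scripted (the regex here is illustrative only and will also hit underscore-prefixed names unrelated to the macros; a real refactor should use clang-based tooling):

```python
import re

def rename_members(cpp_source):
    """Rewrite identifiers beginning with a single underscore to use
    the m_ prefix instead (e.g. _pdata -> m_pdata).

    Rough illustration only: it cannot distinguish variables that the
    hoomd macros care about from other underscore-prefixed names.
    """
    # \b matches before '_' because '_' is a word character,
    # so this catches _pdata after '->', spaces, '(', etc.
    return re.sub(r'\b_(?=[a-zA-Z])', 'm_', cpp_source)

print(rename_members('this->_pdata = _nlist;'))
# this->m_pdata = m_nlist;
```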
Create a changelog file with release goals and notes.
Currently, hoomd-tf logs quite a bit. It would make sense to only output the TF items to tf_manager.log and also to respect the log level set in hoomd. We should also go through to make sure the correct log level is being used, especially for warnings.
One idea is to use masks, but that will not work if particles are sorted or the number of particles changes. Perhaps we can trigger an update to the masks? Could this be affected by Issue #7?
How did you guys install GPU support for hoomd? I know this is unrelated to your project, but I figured you must have run into this issue. I tried compiling it and got this at the end:
HOOMD-blue v2.8.1-26-gef08be936 CUDA (10.2) DOUBLE HPMC_MIXED MPI SSE SSE2 SSE3 SSE4_1 SSE4_2 AVX AVX2
Compiled: 11/26/2019
Copyright (c) 2009-2019 The Regents of the University of Michigan.
-----
system has unsupported display driver / cuda driver combination
HOOMD-blue is running on the CPU
<hoomd.context.SimulationContext object at 0x7f2334d383c8>
We should actively try to disable particle sorting when attaching/initializing.
Need to consistently choose among hoomd-tf, HOOMD-TF, hoomd_tf, and tensorflow_plugin in all places.
XLA speeds things up and seems to have no downsides. Any thoughts @RainierBarrett?
Now that we have the ability to create/run models in a single python script, we should make a folder with some example jupyter notebooks. A great start would be the quickstart from the README.
The scalars weren't being extracted correctly in the C++ code.
Working on PR:
Add some examples to the README about how to use Keras with hoomd-tf, including a note that `model.compile` and `model.fit` do not work.
Would like to be able to run `tfcompute.disable()` and have it cleanly shut down the TF instance.
I am new to hoomd-blue and hoomd-tf, and I am learning both at the same time. I have a TensorFlow model that I want to turn into a graph. My model gets the positions of atoms within a cutoff and gives back the forces, not the potential. Is it possible to do this with hoomd-tf? For example, would a modification like the one below work?
Let's say the force is defined as r/norm(r)^2, where r is the vector between atoms i and j, so that f_vec = norm(f) * e_r with norm(f) = 1/norm(r).
graph = htf.graph_builder(64)  # max neighbors = 64
rinv = graph.nlist_rinv
f = my_tf_model(rinv)
# there is a shape mismatch here, but let's say we solve it (r is a vector)
forces = tf.reduce_sum(tf.multiply(f * r, rinv), axis=-1)
graph.save('my_model', forces)
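For what it's worth, here is a pure-NumPy sketch of the force described above (f = r/|r|^2 for each neighbor, summed over the neighbor list). The array shapes mimic hoomd-tf's padded nlist, but the values are made up:

```python
import numpy as np

# nlist_r: per-particle neighbor displacement vectors, shape
# (N, max_neigh, 3); all-zero rows are padding, as in hoomd-tf's nlist
nlist_r = np.array([[[2.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]],
                    [[-2.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0]]])

r_norm = np.linalg.norm(nlist_r, axis=-1, keepdims=True)
# avoid dividing by zero on padded entries
r_norm = np.where(r_norm == 0, 1.0, r_norm)
# f_vec = e_r / |r| = r / |r|^2 for each neighbor pair
pair_forces = nlist_r / r_norm**2
# total force on each particle: sum over its neighbors
forces = pair_forces.sum(axis=1)
print(forces)
# [[ 0.5  1.   0. ]
#  [-0.5  0.   0. ]]
```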
We should move from a monolithic README to putting the doc into the python source files so that a documentation website can be generated. Related to #43.
Test out installing hoomd-blue with conda, tensorflow with pip, and see if plugin can be made without compiling hoomd-blue.
Need thorough documentation and a Python script to convert an all-atom trajectory to a hoomd trajectory file.
Very large systems cannot fit a complex NN model into memory, so it may be necessary to batch positions/nlist for execution.
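A minimal sketch of position batching (names and sizes are illustrative; real batching would also have to keep each particle's neighbor-list entries in the same batch as the particle itself):

```python
import numpy as np

def batch_positions(positions, batch_size):
    """Yield successive slices of a position array so a model can be
    evaluated piecewise instead of on all particles at once.
    Sketch only: does not handle the nlist consistency issue."""
    for start in range(0, len(positions), batch_size):
        yield positions[start:start + batch_size]

positions = np.zeros((10, 3))  # 10 particles, made-up coordinates
batches = list(batch_positions(positions, 4))
print([len(b) for b in batches])  # [4, 4, 2]
```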
test_reverse_mol_index in test_utils.py is failing with the following error:
__________________________________ test_mol_batching.test_reverse_mol_index ___________________________________
self = <test_tensorflow.test_mol_batching testMethod=test_reverse_mol_index>
    def test_reverse_mol_index(self):
        # need this for logging
        hoomd.context.initialize()
        mi = [[1, 2, 0, 0, 0], [3, 0, 0, 0, 0], [4, 5, 7, 8, 9]]
        rmi = hoomd.htf._make_reverse_indices(mi)
        # should be
        rmi_ref = [
            [0, 0],
            [0, 1],
            [1, 0],
            [2, 0],
            [2, 1],
            [-1, -1],
            [2, 2],
            [2, 3],
            [2, 4]
        ]
>       self.assertEqual(rmi, rmi_ref)
E AssertionError: Lists differ: [[0, 0], [0, 1], [1, 0], [2, 0], [2, 1], [-1, -1], [2, 2], [2, 3], [2, 4], []] != [[0, 0], [0, 1], [1, 0], [2, 0], [2, 1], [-1, -1], [2, 2], [2, 3], [2, 4]]
E
E First list contains 1 additional elements.
E First extra element 9:
E []
E
E - [[0, 0], [0, 1], [1, 0], [2, 0], [2, 1], [-1, -1], [2, 2], [2, 3], [2, 4], []]
E ? --- -
E
E + [[0, 0], [0, 1], [1, 0], [2, 0], [2, 1], [-1, -1], [2, 2], [2, 3], [2, 4]]
test-py/test_tensorflow.py:527: AssertionError
-------------------------------------------- Captured stdout call ---------------------------------------------
Not all of your atoms are in a molecule
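For reference, here is a pure-Python sketch of the mapping the test expects (semantics inferred from the test itself, not taken from the actual `hoomd.htf._make_reverse_indices` implementation). The failure above suggests the real function appends one spurious empty entry past the last atom id:

```python
def make_reverse_indices(mol_indices):
    """Map 1-indexed atom ids to [molecule, position] pairs.
    Zeros in mol_indices are padding; atoms that appear in no
    molecule map to [-1, -1]. Hypothetical reference version."""
    num_atoms = max(max(row) for row in mol_indices)
    rmi = [[-1, -1] for _ in range(num_atoms)]
    for mol, row in enumerate(mol_indices):
        for pos, atom in enumerate(row):
            if atom != 0:
                rmi[atom - 1] = [mol, pos]
    return rmi

mi = [[1, 2, 0, 0, 0], [3, 0, 0, 0, 0], [4, 5, 7, 8, 9]]
print(make_reverse_indices(mi))
# [[0, 0], [0, 1], [1, 0], [2, 0], [2, 1], [-1, -1], [2, 2], [2, 3], [2, 4]]
```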
It is no longer necessary with the change to queues.
Modify the `__init__.py` file to add a `__version__` variable so that people can query `htf.__version__` to get the current version.
The `eds_bias` method in utils.py is not used anywhere in the example_models dir; it appears to be manually implemented instead. The EDS example model directory should show an example use case of this method.
Periodic boundary conditions must be accounted for.
Use GCP to build and run tests on PRs
The example code is not very convenient to use, as it expects, for example, `/scratch/rbarret8/`. Some of the examples use outdated syntax, such as `feed_func` instead of `feed_dict`. Update them so they all take the directory to save/load from as input.
Some example code uses custom code instead of the new plugin running-mean code. Replace it.
If you give a printer node to the `out_nodes` arg in `tensorflow_plugin.graph_builder.graph.save()` after the optimizer node, the model will build, but it hangs on the first step of training. Change the code so that it doesn't hang depending on the order of elements in `out_nodes`.
You can replicate this by running the attached build_test_model.py file with `printer` after `optimizer` in the `out_nodes` list on the last line, then running run_train.py.
import tensorflow as tf
import hoomd.tensorflow_plugin as htf
NN = 63
graph = htf.graph_builder(NN, output_forces=False)
r_inv = graph.nlist_rinv
input_tensor = tf.reshape(r_inv, shape=(-1,1), name='r_inv')
weight = tf.Variable([1.0])
nn_energies = tf.identity(r_inv) * weight
calculated_energies = tf.reduce_sum(nn_energies, axis=1, name='calculated_energies')
calculated_forces = graph.compute_forces(calculated_energies)
cost = tf.losses.mean_squared_error(calculated_forces, graph.forces)
printer = tf.Print(cost, [cost], message='cost is: ')
optimizer = tf.train.AdamOptimizer(0.001).minimize(cost)
# change your model_directory as needed
graph.save(model_directory='/scratch/rbarret8/test_model/', out_nodes=[printer, optimizer])
import hoomd, hoomd.md, hoomd.dump, hoomd.group, hoomd.benchmark
import numpy as np
from hoomd.tensorflow_plugin import tfcompute
import tensorflow as tf
from math import sqrt
from sys import argv as argv
import time

if(len(argv) != 2):
    print('Usage: basic_ann_ff.py [N_PARTICLES]')
    exit(0)

N = int(argv[1])
model_dir = '/scratch/rbarret8/test_model/'  # change this as needed
np.random.seed(42)

with hoomd.tensorflow_plugin.tfcompute(model_dir, _mock_mode=False, write_tensorboard=True) as tfcompute:
    hoomd.context.initialize('--mode=gpu')
    rcut = 3.0
    sqrt_N = int(sqrt(N))  # make sure N is a perfect square
    system = hoomd.init.create_lattice(unitcell=hoomd.lattice.sq(a=2.0),
                                       n=[sqrt_N, sqrt_N])
    nlist = hoomd.md.nlist.cell(check_period=1)
    lj = hoomd.md.pair.lj(rcut, nlist)  # basic LJ forces from HOOMD
    lj.pair_coeff.set('A', 'A', epsilon=1.0, sigma=1.0)
    hoomd.md.integrate.mode_standard(dt=0.005)
    hoomd.md.integrate.langevin(group=hoomd.group.all(), kT=1.0, seed=42)
    # equilibrate for 4k steps first
    hoomd.run(4000)
    # now attach the trainable model
    tfcompute.attach(nlist, r_cut=rcut, save_period=10, period=100)
    # train on 5k timesteps
    hoomd.run(5000)
The current way coarse-grain mapping is done relies on the exact atom ordering. This is an issue when doing domain decomposition, since atoms can leave and enter the simulation domain. It is also an issue if the particle sorter is not disabled. Might be related to Issue #6.
Because we transfer the mol index each step to tfmanager, it doesn't know whether it can cache the mol index or not. We should also pass a flag, to prevent triggering recomputation of all the graph ops that rely on the mol index.
Can we make a proper conda release that people can install?
Like the C++ code, add doxygen comment strings to all the .py files. See HOOMD's guide to style.
Recent changes have added new variables for computing means and RDFs. These should be marked as not trainable.
If not all variables are set when bootstrapping, tensorflow fails. Let's make it so you can load some and initialize others.
We should add documentation about `compute_pairwise_potential` to the README in the utilities section. The method should also be improved to return energy and forces, rather than just energy. This could be used to write out learned potentials in tabular form for use in subsequent simulations.
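As an illustration of the proposed return signature, here is a NumPy sketch that tabulates both energy and the radial force (-dU/dr) for a Lennard-Jones pair potential. This is a hypothetical example, not the `compute_pairwise_potential` API:

```python
import numpy as np

def tabulate_lj(r, epsilon=1.0, sigma=1.0):
    """Return LJ pair energy and analytic radial force -dU/dr on a
    grid of distances, suitable for writing out as a tabular potential.
    Illustrates returning (energy, force) together; hypothetical API."""
    sr6 = (sigma / r)**6
    # U(r) = 4 eps [ (sigma/r)^12 - (sigma/r)^6 ]
    energy = 4 * epsilon * (sr6**2 - sr6)
    # F(r) = -dU/dr = 4 eps [ 12 (sigma/r)^12 - 6 (sigma/r)^6 ] / r
    force = 4 * epsilon * (12 * sr6**2 - 6 * sr6) / r
    return energy, force

r = np.linspace(0.9, 3.0, 64)
u, f = tabulate_lj(r)
# sanity check: the force is strongly repulsive at short range and
# nearly vanishes at the grid point closest to the potential minimum
print(f[0] > 0, abs(f[np.argmin(u)]) < abs(f[0]))  # True True
```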
The atom indices are off by one and the neighbor indices in nlist do not respect the molecular batching. @geemi725
I am using CUDA 10 on a cluster, and I get the following error. How can I resolve it?
Could NOT find CUDA: Found unsuitable version "9.2", but required is exact
version "9.0" (found /usr/local/cuda-9.2)
I believe these lines:
https://github.com/ur-whitelab/hoomd-tf/blob/master/htf/utils.py#L275-L276
should be
dist_mat_r = dist * mask_cast + (1 - mask_cast) * 1000
topk = tf.math.top_k(-dist_mat_r, k=NN, sorted=sorted)
because otherwise we're getting the largest distances instead of the smallest. Why did this work previously? Maybe an extra negative sign in the PBC op?
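A quick NumPy check of the proposed masking trick (the distance values below are made up):

```python
import numpy as np

# hypothetical pair distances for one particle; 0 marks masked entries
dist = np.array([0.0, 2.5, 1.2, 0.0, 3.7, 0.9])
mask_cast = (dist != 0).astype(float)

# proposed fix: push masked entries out to a large distance
dist_mat_r = dist * mask_cast + (1 - mask_cast) * 1000

# tf.math.top_k(-dist_mat_r, k=NN) takes the NN largest of the negated
# distances, i.e. the NN smallest distances; np.argsort mimics it here
NN = 2
idx = np.argsort(-dist_mat_r)[::-1][:NN]
print(np.sort(dist_mat_r[idx]))  # the two nearest: 0.9 and 1.2
```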
Now it is truly not needed with #47
XLA should drastically help the CG mapping operator code and other multi-step complex code.