atomicarchitects / equiformer_v2

[ICLR'24] EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations

Home Page: https://arxiv.org/abs/2306.12059

License: MIT License

Topics: catalyst-design, computational-chemistry, computational-physics, deep-learning, drug-discovery, equivariant-graph-neural-network, equivariant-neural-networks, force-fields, interatomic-potentials, machine-learning

equiformer_v2's Introduction

EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations

Paper | OpenReview | Poster

This repository contains the official PyTorch implementation of the work "EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations" (ICLR 2024). We provide the code for training the base model setting on the OC20 S2EF-2M and S2EF-All+MD datasets.

Additionally, EquiformerV2 has been incorporated into the OCP repository and is used in the Open Catalyst demo.

In subsequent work, we found that BERT-style self-supervised learning can be generalized to 3D atomistic systems in the form of DeNS (Denoising Non-Equilibrium Structures), which improves EquiformerV2's energy and force predictions. Please refer to the paper and the code for further details.
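At a high level, the auxiliary task corrupts atomic positions with noise and trains the network to recover it. A minimal sketch of the idea (not the actual DeNS implementation; sigma and loss_weight below are hypothetical hyperparameters, and the model is assumed to return an (energy, forces) pair):

    import torch
    import torch.nn.functional as F

    def dens_auxiliary_loss(model, data, sigma=0.1, loss_weight=0.25):
        # Perturb atomic positions with Gaussian noise (sigma is hypothetical).
        noise = sigma * torch.randn_like(data.pos)
        data.pos = data.pos + noise
        _, predicted = model(data)
        # Auxiliary objective: recover the injected noise from the corrupted structure.
        return loss_weight * F.mse_loss(predicted, noise)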


Content

  1. Environment Setup
  2. Changelog
  3. Training
  4. File Structure
  5. Checkpoints
  6. Citation
  7. Acknowledgement

Environment Setup

Environment

See here for setting up the environment.

OC20

The OC20 S2EF dataset can be downloaded by following the instructions in the OCP GitHub repository.

For example, we can download the OC20 S2EF-2M dataset by running:

    cd ocp
    python scripts/download_data.py --task s2ef --split "2M" --num-workers 8 --ref-energy

We also need to download the "val_id" data split to run training.
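Assuming the same download script handles validation splits, this would be:

    cd ocp
    python scripts/download_data.py --task s2ef --split val_id --num-workers 8 --ref-energy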

After downloading, place the datasets under datasets/oc20/ by using ln -s:

    cd datasets
    mkdir oc20
    cd oc20
    ln -s ~/ocp/data/s2ef s2ef

To train on different splits like All and All+MD, we can follow the same link above to download the datasets.

Changelog

Please see here.

Training

OC20

  1. We train EquiformerV2 on the OC20 S2EF-2M dataset by running:

        sh scripts/train/oc20/s2ef/equiformer_v2/equiformer_v2_N@12_L@6_M@2_splits@[email protected]

    The above script uses 2 nodes with 8 GPUs on each node.

    If there is an import error, it is likely that ocp/ocpmodels/common/utils.py has not been modified. Please follow here for details.

    We can also run training on 8 GPUs on 1 node:

        sh scripts/train/oc20/s2ef/equiformer_v2/equiformer_v2_N@12_L@6_M@2_splits@[email protected]
  2. We train EquiformerV2 (153M) on OC20 S2EF-All+MD by running:

        sh scripts/train/oc20/s2ef/equiformer_v2/equiformer_v2_N@20_L@6_M@3_splits@[email protected]

    The above script uses 16 nodes with 8 GPUs on each node.

  3. We train EquiformerV2 (31M) on OC20 S2EF-All+MD by running:

        sh scripts/train/oc20/s2ef/equiformer_v2/equiformer_v2_N@8_L@4_M@2_splits@[email protected]

    The above script uses 8 nodes with 8 GPUs on each node.

  4. We can train EquiformerV2 with DeNS (Denoising Non-Equilibrium Structures) as an auxiliary task to further improve the performance on energy and force predictions. Please refer to the code for details.

File Structure

  1. nets contains the network architectures for OC20.
  2. scripts contains the scripts for training models on OC20.
  3. main_oc20.py is the entry point for training, evaluating, and running relaxations (a sample invocation is sketched after this list).
  4. oc20/trainer contains the force trainer as well as some utility functions.
  5. oc20/configs contains config files for S2EF.
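For reference, the training scripts wrap an invocation of main_oc20.py along these lines (a sketch assuming an OCP-style command-line interface; the config path is illustrative, so check the scripts under scripts/train for the exact flags):

    python main_oc20.py --mode train \
        --config-yml 'oc20/configs/s2ef/2M/equiformer_v2/equiformer_v2_N@12_L@6_M@2.yml'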

Checkpoints

We provide checkpoints of EquiformerV2 trained on the S2EF-2M dataset for 30 epochs, EquiformerV2 (31M) trained on S2EF-All+MD, and EquiformerV2 (153M) trained on S2EF-All+MD.

Model                Split    Download              val force MAE (meV/Å)   val energy MAE (meV)
EquiformerV2         2M       checkpoint | config   19.4                    278
EquiformerV2 (31M)   All+MD   checkpoint | config   16.3                    232
EquiformerV2 (153M)  All+MD   checkpoint | config   15.0                    227
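To inspect a downloaded checkpoint before training with it, a minimal sketch (assuming the OCP-style checkpoint layout, which stores weights under the "state_dict" key; the filename is hypothetical):

    import torch

    ckpt = torch.load("equiformer_v2_153M.pt", map_location="cpu")  # hypothetical filename
    state_dict = ckpt["state_dict"]  # OCP-style checkpoints keep model weights here
    print(f"{len(state_dict)} parameter tensors")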

Citation

Please consider citing the works below if this repository is helpful:

  • EquiformerV2:

    @inproceedings{
        equiformer_v2,
        title={{EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations}}, 
        author={Yi-Lun Liao and Brandon Wood and Abhishek Das* and Tess Smidt*},
        booktitle={International Conference on Learning Representations (ICLR)},
        year={2024},
        url={https://openreview.net/forum?id=mCOBKZmrzD}
    }
  • eSCN:

    @inproceedings{
        escn,
        title={{Reducing SO(3) Convolutions to SO(2) for Efficient Equivariant GNNs}},
        author={Passaro, Saro and Zitnick, C Lawrence},
        booktitle={International Conference on Machine Learning (ICML)},
        year={2023}
    }
  • Equiformer:

    @inproceedings{
        equiformer,
        title={{Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs}},
        author={Yi-Lun Liao and Tess Smidt},
        booktitle={International Conference on Learning Representations (ICLR)},
        year={2023},
        url={https://openreview.net/forum?id=KwmPfARgOTD}
    }
  • OC20 dataset:

    @article{
        oc20,
        author = {Chanussot*, Lowik and Das*, Abhishek and Goyal*, Siddharth and Lavril*, Thibaut and Shuaibi*, Muhammed and Riviere, Morgane and Tran, Kevin and Heras-Domingo, Javier and Ho, Caleb and Hu, Weihua and Palizhati, Aini and Sriram, Anuroop and Wood, Brandon and Yoon, Junwoong and Parikh, Devi and Zitnick, C. Lawrence and Ulissi, Zachary},
        title = {{Open Catalyst 2020 (OC20) Dataset and Community Challenges}},
        journal = {ACS Catalysis},
        year = {2021},
        doi = {10.1021/acscatal.0c04525},
    }

Please direct questions to Yi-Lun Liao ([email protected]).

Acknowledgement

Our implementation is based on PyTorch, PyG, e3nn, timm, ocp, and Equiformer.


equiformer_v2's Issues

Small equivariant example

Hi @yilunliao,

Thanks for the nice codebase - I am adapting it for another purpose, and I was running into some issues when checking that the outputs are actually equivariant. Are there any init flags that must be set in a certain way to guarantee equivariance?

I have a snippet equivalent to this:

import torch
from e3nn import o3
from torch_geometric.data import Data
from nets.equiformer_v2.equiformer_v2_oc20 import EquiformerV2_OC20

edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
pos = torch.randn(3, 3)  # 3 nodes, matching the node indices in edge_index
data = Data(pos=pos, edge_index=edge_index)

R = o3.rand_matrix()  # already a tensor; no need to wrap in torch.tensor

model = EquiformerV2_OC20(
        num_layers=2,
        attn_hidden_channels=16,
        ffn_hidden_channels=16,
        sphere_channels=16,
        edge_channels=16,
        alpha_drop=0.0,      # turn off dropout for the equivariance check
        drop_path_rate=0.0,  # turn off drop path for the equivariance check
    )
model.eval()  # disable any remaining training-only stochastic behaviour

energy1, forces1 = model(data)
data.pos = pos @ R
energy2, forces2 = model(data)

assert torch.allclose(energy1, energy2, atol=1.0e-3)
# Forces should rotate with the inputs: forces(pos @ R) == forces(pos) @ R.
assert torch.allclose(forces1 @ R, forces2, atol=1.0e-3)

and the energies are equal, but the forces are not equivariant under rotation. I've turned off all dropout and set the model to eval mode - just wondering if there are any other tricks needed to retain genuinely equivariant behaviour. Thanks!

Multi-node multi-gpu training

Could you provide instructions on how to run experiments in the multi-node multi-GPU setting without using submitit? For example, I have 2 nodes, each with 16 GPUs. How should I modify the provided scripts to reproduce the reported results?

Thanks!
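One general pattern for launching multi-node jobs without submitit is torchrun, executed on every node with the same arguments (a sketch only: whether main_oc20.py picks up torchrun's environment variables is something to verify, MASTER_ADDR is a placeholder for node 0's hostname, and the config flag is assumed from the OCP-style interface):

    # Run this on each of the 2 nodes.
    torchrun --nnodes=2 --nproc_per_node=16 \
        --rdzv_backend=c10d --rdzv_endpoint=$MASTER_ADDR:29500 \
        main_oc20.py --mode train --config-yml <config>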

e3nn tensors compatibility issue

Hi, I am trying to integrate this with the e3nn package.

For the SO3Embedding class, how can I convert it to irreps compatible with the e3nn convention?
My implementation (not sure whether this is right):

    def to_e3nn_embeddings(self):
        from e3nn.io import SphericalTensor
        from e3nn.o3 import Irreps

        # Flatten (length, num_coefficients, num_channels) into (length, -1).
        embedding = self.embedding.reshape(self.length, -1)

        # Repeat each degree num_channels times to account for multiple channels.
        irreps = Irreps(str(SphericalTensor(self.lmax_list[-1], 1, -1)).replace('1x', f'{self.num_channels}x'))
        return irreps, embedding

Can this model be used for molecule generation?

Hi,

I've heard of this strong model, which can learn atomic coordinates. Now I want to adapt it for my project, but I find the code a bit complicated and hard to follow. (The paper also contains some concepts that are hard to understand.)

I want to make sure, if this model can learn the following task:

**INPUTS**: protein_atom_pos, protein_atom_types, init_molecule_pos, init_molecule_types
**OUTPUTS**: molecule_pos, molecule_types

In other words, I want to learn the molecule given a protein pocket, a task known as Structure-based Drug Design.

Please help me verify whether this model is suitable for SE(3)-invariant molecule learning, and point me to the relevant code.

Thanks!

Question on eSCN

Hello, I have a question about eSCN.
I have read section "A.3 ESCN CONVOLUTION" of your paper, and I wonder whether there is an implicit condition on the summation over the irrep orders L_i and L_f. More specifically, assuming the tensor product "1o x (0e+1o+2e)" with "1o" as the output irrep: how does EquiformerV2 distinguish the odd irreps "1o" that come from "1o x (0e+2e)" from the even irreps "1e" that come from "1o x 1o"?
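For background, which paths can contribute is governed by the standard selection rule for tensor products of O(3) irreps: an output of order $L_o$ and parity $p_o$ arises from $L_i \otimes L_f$ only if

    $$ |L_i - L_f| \le L_o \le L_i + L_f, \qquad p_o = p_i \, p_f. $$

So in the example, 1o x 0e and 1o x 2e yield odd vectors (1o), while 1o x 1o yields an even pseudovector (1e); in e3nn's convention the parity label is what keeps 1o and 1e distinct. (This describes full O(3) tensor products; how EquiformerV2's SO(2)-based convolutions handle parity is for the authors to confirm.)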

Inquiry About rescale_factor in SO3.py

Dear Developer,

I am writing to inquire about the rescale_factor = math.sqrt(length / (2 * mmax + 1)) defined in the SO3_Grid and CoefficientMappingModule classes within SO3.py. Could you please explain its purpose? Additionally, would omitting rescale_factor affect the network's equivariance?

Thank you for your assistance.

Best regards,

Yang Zhong

Speed compared to `TorchMD-Net`

I wanted to train this network on the SPICE dataset (a similar task, where I want to predict forces and energy from structure). I was comparing training speed with TorchMD-Net (https://github.com/torchmd/torchmd-net). For the same parameter count, TorchMD-Net is at least three times faster than Equiformer and twice as fast as EquiformerV2. Is this expected, or a bug on my end?

Upload models to the Hugging Face Hub

Hi!

Very cool work! It would be nice to have the model checkpoints on the Hugging Face Hub rather than behind a Dropbox link.

Some of the benefits of sharing your models through the Hub would be:

  • versioning, commit history and diffs
  • repos provide useful metadata about their tasks, languages, metrics, etc that make them discoverable
  • multiple features from TensorBoard visualizations, PapersWithCode integration, and more
  • wider reach of your work to the ecosystem

Creating the repos and adding new models should be a relatively straightforward process if you've used Git before. This is a step-by-step guide explaining the process in case you're interested. Please let us know if you would be interested and if you have any questions.

some questions

Hi, not an issue, just some questions.

  1. Is there any comparison of performance against other SOTA equivariant networks such as MACE or NequIP?

  2. Is MD simulation available, or is any development in progress?

Thanks for the nice work!

Can equiformer predict force using energy gradients?

I looked into the code and the paper, and it seems that EquiformerV2 predicts forces directly rather than as the gradient of the energy. Although there are pros and cons to both, I'm curious whether there is any plan to support force predictions from energy gradients, which could be implemented using autograd.
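For illustration, the gradient-based alternative usually looks like this (a generic sketch, not this repo's API; model_energy is a hypothetical callable returning one scalar energy per structure):

    import torch

    def forces_from_energy(model_energy, pos, atomic_numbers):
        pos = pos.clone().requires_grad_(True)
        energy = model_energy(pos, atomic_numbers)  # hypothetical energy-only model
        # Forces are the negative gradient of the energy w.r.t. positions;
        # create_graph=True allows training on force errors.
        (grad,) = torch.autograd.grad(energy.sum(), pos, create_graph=True)
        return energy, -grad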

How to train on a customized dataset?

I have a dataset of molecules with atom types, positions, energies, and forces. I wonder how to train equiformer_v2 on it. Is it possible to support custom dataset training, or at least to provide a tutorial or some suggestions?
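One possible route (assuming the OCP toolchain this repo builds on; the filename is hypothetical) is to convert ASE Atoms objects with attached energies and forces into graph data with OCP's AtomsToGraphs, then store them in an LMDB as described in OCP's tutorials:

    from ase.io import read
    from ocpmodels.preprocessing import AtomsToGraphs

    # Structures with energies/forces attached via an ASE calculator (hypothetical file).
    atoms_list = read("my_molecules.xyz", index=":")
    a2g = AtomsToGraphs(max_neigh=50, radius=6, r_energy=True, r_forces=True)
    data_list = a2g.convert_all(atoms_list)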

Question about the edge_rot_mat

Hi, thanks for your wonderful work and code! I am new to equivariant learning, and I am trying to understand each step of your code. I suspect this function (the one that calculates the edge rotation matrix) may break equivariance in the edge-degree embedding layer.

In this function, you take the original edge_distance_vec as the final x-axis, select the y-axis randomly, and generate the z-axis from the other two. I think this random operation is not correct.

For example, suppose I feed the network two copies of the same molecule, where the coordinates of one are translated (but not rotated) relative to the other. If the network is SE(3)-equivariant, the final features should be identical for the two molecules. However, in the function above they receive different edge rotation matrices for the same edge because of the randomly selected y-axis, so the corresponding Wigner-D matrices differ; since the edge-degree embedding layer uses the Wigner-D matrix, the embeddings of the two molecules end up different. This breaks equivariance for all features of type > 0.

I am not sure whether I am correct. Looking forward to your opinion on this issue.

Incorporating vector node features

Hi, nice work! I was wondering what it would take to accommodate systems whose nodes carry additional non-scalar features. Any hints or snippets would be greatly appreciated. Thanks.

Large performance gap in MD17/22 dataset

Thank you for the great work on EquiformerV2. When I test its performance on the MD17/MD22 datasets, I find it lags far behind SOTA models such as VisNet. For example, on MD22_AT_AT, VisNet's validation loss converges to 0.14 for energy and 0.17 for forces, while for EquiformerV2 the validation loss is 4.7 for energy and 5.1 for forces.
I follow the settings in oc20/configs/s2ef/all_md/equiformer_v2/equiformer_v2_N@8_L@4_M@2_31M.yml. Are there things I need to modify to adapt EquiformerV2 to MD datasets? Thanks.

Training logs of QM9 dataset for EquiformerV2

Hello, thank you for sharing your great work!

I was wondering if it would be possible for you to share the training logs, similar to how logs were provided for Equiformer. I'm specifically interested in the higher-precision test results. For example, for target 0 (mu), the result reported in the paper is given only as 0.11, whereas the Equiformer logs show the higher-precision value 0.1172.

If you're able to share the logs, I would greatly appreciate it as it would allow for a more detailed analysis of the impressive results you achieved.

Thank you again for this excellent contribution to the field. I look forward to hearing back from you @yilunliao.

Best regards,

Using Equiformer for Encoding Protein 3D Structure

Thanks for putting together this awesome repo!

I'm working on a VQ-VAE model for the backbone atoms of protein 3D structures and thinking about using EquiformerV2 as the first part of the encoder. I used the original SE(3) model before, but it was very computationally heavy.

My plan is to quantize protein 3D structures residue-wise based on the number of carbon alpha atoms and only pass the coordinates to the encoder, leaving out the amino acid type info.

Do you have any suggestions or thoughts on how well this setup might work?
