lujiarui / str2str

Codebase of the paper "Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling" (ICLR 2024)

License: MIT License

Makefile 0.17% Python 99.26% Shell 0.57%

computational-biology conformation-generation diffusion-models generative-models molecular-dynamics molecular-simulation pytorch

str2str's Issues

Is there any recommended configuration for training Str2Str?

Hi, I am trying to retrain Str2Str, but training consumes a huge amount of GPU memory, even with batch_size=1. Is there a recommended hardware configuration for training Str2Str models? Thank you.

TICA plot does not match Figure S4

Hello Jiarui,
Recently I have been using mdtraj to extract 1000 frames as a reference and sampling 1000 frames for each of the 12 fast-folding proteins. Specifically, I subsample the microsecond-scale MD data at a fixed interval to obtain 1000 frames. However, the resulting TICA plots are not even close to Figure S4, even though I used the default eval.py and metrics.py. I am confused about what causes this. Could you help me understand these results? Thanks.
metrics_dev_0318-05-27.csv
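
For reference, the fixed-interval subsampling described above can be sketched as follows. This is a minimal sketch, not the repository's preprocessing: the function name `evenly_spaced_frames` and the trajectory length of 100,000 frames are hypothetical, and the function only computes frame indices.

```python
def evenly_spaced_frames(n_total: int, n_keep: int = 1000) -> list:
    """Return up to n_keep frame indices spread evenly over a trajectory.

    Hypothetical helper for illustration; it is not part of Str2Str.
    """
    if n_total <= 0 or n_keep <= 0:
        raise ValueError("n_total and n_keep must be positive")
    if n_total <= n_keep:
        # Trajectory is already short enough: keep every frame.
        return list(range(n_total))
    if n_keep == 1:
        return [0]
    # Fractional step so that the first and last frames are both included.
    step = (n_total - 1) / (n_keep - 1)
    return [round(i * step) for i in range(n_keep)]


# Assumed trajectory length, chosen only for the example.
indices = evenly_spaced_frames(n_total=100_000, n_keep=1000)
print(len(indices), indices[0], indices[-1])
```

The resulting indices could then be used to select frames from a loaded trajectory, for example with mdtraj's `Trajectory.slice`. Whether the reference frames are chosen this way or by another scheme may affect how the TICA projection looks, so it is worth checking which scheme the paper used.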

Errors in using pretrained models

Hi! I am trying to predict protein ensembles by using the pretrained model offered on Google Drive (named pretrain.pth). However, something seems to be wrong when loading the state dict.

My command: python eval.py task_name=inference target_dir=null ckpt_path=/home/jyzha/software/Str2Str/data/pretrain.pth

The output and error after printing the config:

[2024-03-07 13:48:24,178][main][INFO] - [rank: 0] Instantiating datamodule <src.data.protein_datamodule.ProteinDataModule>
[2024-03-07 13:48:24,187][main][INFO] - [rank: 0] Instantiating model <src.models.diffusion_module.DiffusionLitModule>
/home/jyzha/software/anaconda3/envs/str2str/lib/python3.9/site-packages/torch/nn/modules/transformer.py:286: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
warnings.warn(f"enable_nested_tensor is True, but self.use_nested_tensor is False because {why_not_sparsity_fast_path}")
/home/jyzha/software/anaconda3/envs/str2str/lib/python3.9/site-packages/lightning/pytorch/utilities/parsing.py:199: Attribute 'net' is an instance of nn.Module and is already saved during checkpointing. It is recommended to ignore them using self.save_hyperparameters(ignore=['net']).
[2024-03-07 13:48:28,277][main][INFO] - [rank: 0] Instantiating loggers...
[2024-03-07 13:48:28,278][src.utils.instantiators][WARNING] - [rank: 0] No logger configs found! Skipping...
[2024-03-07 13:48:28,278][main][INFO] - [rank: 0] Instantiating trainer <lightning.pytorch.trainer.Trainer>
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
[2024-03-07 13:48:28,512][src.utils.utils][ERROR] - [rank: 0]
Traceback (most recent call last):
File "/home/jyzha/software/Str2Str/src/utils/utils.py", line 68, in wrap
metric_dict, object_dict = task_func(cfg=cfg)
File "/home/jyzha/software/Str2Str/src/eval.py", line 144, in evaluate
model, ckpt_path = checkpoint_utils.load_model_checkpoint(model, cfg.ckpt_path)
File "/home/jyzha/software/Str2Str/src/utils/checkpoint_utils.py", line 19, in load_model_checkpoint
model.net.load_state_dict(net_params)
File "/home/jyzha/software/anaconda3/envs/str2str/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DenoisingNet:
Unexpected key(s) in state_dict: "translator.trunk.ipa_0.linear_rbf.weight", "translator.trunk.ipa_0.linear_rbf.bias", "translator.trunk.ipa_1.linear_rbf.weight", "translator.trunk.ipa_1.linear_rbf.bias", "translator.trunk.ipa_2.linear_rbf.weight", "translator.trunk.ipa_2.linear_rbf.bias", "translator.trunk.ipa_3.linear_rbf.weight", "translator.trunk.ipa_3.linear_rbf.bias".
[2024-03-07 13:48:28,516][src.utils.utils][INFO] - [rank: 0] Output dir: /home/jyzha/software/Str2Str/logs/inference/runs/2024-03-07_13-48-23
Error executing job with overrides: ['task_name=inference', 'target_dir=null', 'ckpt_path=/home/jyzha/software/Str2Str/data/pretrain.pth']
Traceback (most recent call last):
File "/home/jyzha/software/Str2Str/src/eval.py", line 173, in main
evaluate(cfg)
File "/home/jyzha/software/Str2Str/src/utils/utils.py", line 78, in wrap
raise ex
File "/home/jyzha/software/Str2Str/src/utils/utils.py", line 68, in wrap
metric_dict, object_dict = task_func(cfg=cfg)
File "/home/jyzha/software/Str2Str/src/eval.py", line 144, in evaluate
model, ckpt_path = checkpoint_utils.load_model_checkpoint(model, cfg.ckpt_path)
File "/home/jyzha/software/Str2Str/src/utils/checkpoint_utils.py", line 19, in load_model_checkpoint
model.net.load_state_dict(net_params)
File "/home/jyzha/software/anaconda3/envs/str2str/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DenoisingNet:
Unexpected key(s) in state_dict: "translator.trunk.ipa_0.linear_rbf.weight", "translator.trunk.ipa_0.linear_rbf.bias", "translator.trunk.ipa_1.linear_rbf.weight", "translator.trunk.ipa_1.linear_rbf.bias", "translator.trunk.ipa_2.linear_rbf.weight", "translator.trunk.ipa_2.linear_rbf.bias", "translator.trunk.ipa_3.linear_rbf.weight", "translator.trunk.ipa_3.linear_rbf.bias".
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

I am looking for a way to fix this. Thank you very much.
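
A quick way to inspect this kind of mismatch is to compare the key sets of the checkpoint and the model before loading. The sketch below works on plain dictionaries so it runs without PyTorch; the helper names `diff_state_dict_keys` and `drop_unexpected` are hypothetical, and the model key `translator.trunk.ipa_0.linear_q.weight` is a placeholder, not taken from the Str2Str code.

```python
def diff_state_dict_keys(model_keys, ckpt_keys):
    """Report keys the model expects but lacks, and keys it does not know."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    return {
        "missing": sorted(model_keys - ckpt_keys),
        "unexpected": sorted(ckpt_keys - model_keys),
    }


def drop_unexpected(ckpt_state, model_keys):
    """Return a copy of the checkpoint without keys the model does not define."""
    allowed = set(model_keys)
    return {k: v for k, v in ckpt_state.items() if k in allowed}


# Placeholder data: the shared key is hypothetical; the two "linear_rbf"
# keys are copied from the error message above.
model_keys = ["translator.trunk.ipa_0.linear_q.weight"]
ckpt_state = {
    "translator.trunk.ipa_0.linear_q.weight": 0.0,
    "translator.trunk.ipa_0.linear_rbf.weight": 0.0,
    "translator.trunk.ipa_0.linear_rbf.bias": 0.0,
}

report = diff_state_dict_keys(model_keys, ckpt_state.keys())
filtered = drop_unexpected(ckpt_state, model_keys)
print(report["unexpected"])
print(sorted(filtered))
```

With a real model, the same comparison can be made between `model.net.state_dict().keys()` and the loaded checkpoint. Note that dropping the extra keys (or calling `load_state_dict` with `strict=False`) discards trained parameters. The `linear_rbf` keys may instead indicate that the checkpoint was produced by a different version or configuration of the code, in which case matching the code to the checkpoint is the safer fix.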

torch.cuda.OutOfMemoryError during inference on a large protein

Hi, when I attempted inference (not training) on a protein with a sequence length of 525 using Str2Str, the following memory error occurred:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.19 GiB. GPU 0 has a total capacity of 44.40 GiB of which 4.20 GiB is free. Including non-PyTorch memory, this process has 40.20 GiB memory in use. Of the allocated memory 39.85 GiB is allocated by PyTorch, and 40.98 MiB is reserved by PyTorch but unallocated.
The code was run on an NVIDIA A40-Xeon-48GB GPU. Is there any way to run inference for this protein with Str2Str within these computing resources? Looking forward to your reply. Thanks.
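
As rough intuition for why longer sequences are so much more expensive, the sketch below estimates the size of a single pairwise (L x L x C) activation tensor. It assumes, without confirmation from the Str2Str code, that the model keeps pair features whose memory grows with the square of the sequence length; the channel count of 128 and the helper name `pairwise_tensor_gib` are illustrative assumptions only.

```python
def pairwise_tensor_gib(seq_len: int, channels: int,
                        bytes_per_elem: int = 4, batch: int = 1) -> float:
    """Size in GiB of one (batch, L, L, C) tensor; float32 by default."""
    n_bytes = batch * seq_len * seq_len * channels * bytes_per_elem
    return n_bytes / 2**30


# Assumed channel count, not read from any Str2Str config.
CHANNELS = 128

for length in (100, 262, 525):
    print(length, round(pairwise_tensor_gib(length, CHANNELS), 3))

# Doubling the length quadruples the footprint of each such tensor.
ratio = pairwise_tensor_gib(1050, CHANNELS) / pairwise_tensor_gib(525, CHANNELS)
print(ratio)
```

A single tensor of this kind is small, but many of them are typically alive at once across layers and sampling steps, and the total is multiplied by the number of conformations sampled in one batch. Under these assumptions, lowering the per-batch sample count is the first thing to try.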

src install error

Hi Jiarui,
I am trying to create the environment provided in env.yaml, but I get an error when installing src==0.0.1:

pip install src==0.0.1
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting src==0.0.1
Downloading src-0.0.1.zip (3.4 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "/tmp/pip-install-6vdmjax1/src_62c18340b1e54ae7a883b44d67a112fc/setup.py", line 14, in
long_description = '\n\n'.join([open(f).read() for f in [
File "/tmp/pip-install-6vdmjax1/src_62c18340b1e54ae7a883b44d67a112fc/setup.py", line 14, in
long_description = '\n\n'.join([open(f).read() for f in [
FileNotFoundError: [Errno 2] No such file or directory: 'README.rst'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Have you encountered this error, or do you know how to fix it? Thanks.
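
One possible workaround, assuming the `src==0.0.1` entry refers to the repository's own `src` package rather than the unrelated PyPI distribution of the same name, is to strip that line from the environment file before creating the environment. The sketch below is only an illustration: the helper name `drop_src_pin` and the sample file contents are hypothetical, not copied from the actual env.yaml.

```python
import re

# Matches a pip requirement line such as "    - src==0.0.1".
_SRC_PIN = re.compile(r"^\s*-\s*src==")


def drop_src_pin(env_text: str) -> str:
    """Return the environment file text without the 'src==...' requirement."""
    kept = [line for line in env_text.splitlines() if not _SRC_PIN.match(line)]
    return "\n".join(kept) + "\n"


# Hypothetical excerpt standing in for the real env.yaml.
sample = """name: str2str
dependencies:
  - python=3.9
  - pip:
    - numpy
    - src==0.0.1
"""

cleaned = drop_src_pin(sample)
print(cleaned)
```

After creating the environment from the cleaned file, the repository itself could be installed from its checkout (for example with `pip install -e .`, if it ships a setup file), which would provide the local `src` package.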
