aqlaboratory / openfold
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2
License: Apache License 2.0
Line 131 of train_openfold.py calls data_module.prepare_data, which raises an error: the OpenFoldDataModule class has no prepare_data method. Is it missing?
Hi, will you be releasing the parameters under a non-academic license as well? Or do we have to train the model from scratch?
Why do I get this error when I specify a validation set during training: AttributeError: 'OpenFoldWrapper' object has no attribute 'cached_weights'
Hi,
Thanks for a great repo!
I'm confused why the template's amino acids and the msa's amino acids are modified again using rc.MAP_HHBLITS_AATYPE_TO_OUR_AATYPE.
It seems like we read the amino acids from the pdb structure and convert them to ids using HHBLITS_AA_TO_ID. I'm wondering why we need to modify them again?
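One possible reading, sketched under the assumption that openfold's residue constants mirror AlphaFold's: HHBLITS_AA_TO_ID numbers residues in HHblits's alphabetical one-letter order, while the model indexes residues in its own restype order, so the second lookup is just a permutation from one index space to the other. The two orderings below are written out by hand for illustration and are not copied from the repo; `convert` is a hypothetical helper.

```python
# Assumed orderings (illustrative; check residue_constants.py for the real ones).
HHBLITS_ORDER = list("ACDEFGHIKLMNPQRSTVWY") + ["X", "-"]  # alphabetical by letter
MODEL_ORDER = list("ARNDCQEGHILKMFPSTWYV") + ["X", "-"]    # model's restype order

# Entry i holds the model index of the residue that HHblits numbers i.
hhblits_to_model = tuple(MODEL_ORDER.index(aa) for aa in HHBLITS_ORDER)


def convert(hhblits_ids):
    """Permute a sequence of HHblits-style ids into model-style ids."""
    return [hhblits_to_model[i] for i in hhblits_ids]


seq = "ACDR"
hhblits_ids = [HHBLITS_ORDER.index(aa) for aa in seq]
model_ids = convert(hhblits_ids)
print(hhblits_ids, model_ids)
```

Under these assumptions the same residue gets a different integer in each scheme (cysteine is 1 in one ordering and 4 in the other), which would explain why a second mapping step is applied after parsing.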
Our recent OpenMM 7.6 release included some namespace changes that look like they required pinning this repo to OpenMM 7.5.1.
Is it OK with you folks if we propose a pull request that should enable compatibility with OpenMM 7.6 and later (ideally without breaking backwards compatibility as well)?
Thanks so much for an excellent repo!
I'm trying to weigh all of the options for acquiring MSAs in order to train the model. I could either 1) use trRosetta's MSAs, 2) use ProteinNet's MSAs, or 3) make MSAs myself using MMseqs2. Do you know how these options compare and how long 3) would take?
Thanks!
I need to precompute protein alignments before training the model, but I didn't find the path mmcif_dir/.
Checkpoints are not saved after the validation epoch ends, even though checkpoint_best_val is active.
Validation loss is not shown during validation, maybe this is connected? (since it's supposed to track val_loss)
The chunk_layer function in openfold/utils/tensor_utils.py, which implements the "chunking" procedure described in subsection 1.11.8 of the AlphaFold 2 supplement, relies on a memory-expensive expand/reshape operation at the top to standardize the batch dimensions of input tensors. This operation can be a bottleneck during inference, so some optimization here would do wonders.
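For readers unfamiliar with the idea, here is a minimal sketch of chunking in general: a function is applied to slices of the batch and the partial results are joined, so only one slice's intermediates are alive at a time. It uses plain Python lists instead of tensors, `chunk_apply` is a hypothetical name, and it does not reproduce the expand/reshape step of chunk_layer that this issue is about.

```python
from typing import Callable, List, Sequence


def chunk_apply(
    fn: Callable[[Sequence[float]], List[float]],
    rows: Sequence[float],
    chunk_size: int,
) -> List[float]:
    """Apply fn to rows in slices of at most chunk_size and join the results."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    out: List[float] = []
    for start in range(0, len(rows), chunk_size):
        # Only this slice's intermediate results exist at any one time.
        out.extend(fn(rows[start:start + chunk_size]))
    return out


def square_all(xs):
    """Stand-in for a layer that treats every row independently."""
    return [x * x for x in xs]


data = list(range(10))
chunked = chunk_apply(square_all, data, chunk_size=4)
print(chunked == square_all(data))
```

The equivalence only holds when the function treats rows independently; anything that mixes information across the chunked dimension cannot be split this way.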
First of all, great work!
I'm wondering what training times I can expect for a single target. I'm currently at 1 min/it (one sample per iteration), which seems too slow (V100s with fp16 and DeepSpeed activated, crop size 256). The official implementation takes around 20 s for a comparable sample (single GPU; about 16 s with an A100). I haven't tested how much overhead DeepSpeed introduces. Gradient accumulation should help to reduce this.
Is it actually possible to train with batch size > 1 on a single GPU? I'm assuming it would work with fixed_size=True. I just vaguely remember that they did some dimensionality juggling with the template/recycling dimensions, which might interfere.
Thanks!
Hello everyone. I am evaluating the inference pipeline. How can I score a predicted PDB file (for example with TM-score) for proteins where CASP14 does not provide a reference PDB file?
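As background on the metric itself, here is a minimal sketch of the TM-score sum for a fixed superposition, assuming you already have per-residue distances (in Å) between aligned C-alpha atoms of the prediction and an experimental structure. The real TM-score is the maximum of this quantity over superpositions, so an established tool should be used for reported numbers; the function names and the 0.5 Å floor on d0 are assumptions of this sketch.

```python
def tm_d0(l_ref: int) -> float:
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8."""
    if l_ref <= 15:
        return 0.5  # assumed floor for very short chains
    return max(1.24 * (l_ref - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score(distances, l_ref: int) -> float:
    """TM-score sum for one fixed superposition.

    distances: distances of the aligned residue pairs only.
    l_ref: length of the reference (experimental) chain.
    """
    d0 = tm_d0(l_ref)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_ref


perfect = tm_score([0.0] * 100, l_ref=100)
half_covered = tm_score([0.0] * 50, l_ref=100)
print(perfect, half_covered)
```

Because the sum is normalised by the reference length, residues that are missing from the alignment count as zero, which is why the half-covered case scores 0.5 even with perfect distances.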
Hi,
When I'm trying to train the model and running this command :
python /data/openfold/train_openfold.py /home/ubuntu/train_mmcif_Dec29_2021/ //home/ubuntu/ProteinNet_parsed/ProteinNet_MSA/ /data/af_databases/pdb_mmcif/mmcif_files/ /home/ubuntu/OF_train_from_ProteinNet_try1_Dec29_20210/ 2021-10-10 --template_release_dates_cache_path /data/af_databases/pdb_mmcif/mmcif_cache.json --precision 16 --replace_sampler_ddp=True --deepspeed_config_path /data/deepspeed_config.json --resume_from_ckpt ckpt_dir/ --gpus 1 --precision 16 --seed 44
I get this error -
"Module 'Attention' has no attribute 'linear_g' : "
I'm running it from the conda env (openfold_venv)
Thanks
Oz
In openfold/scripts/prep_mmseqs_dbs.sh, I think the command should be mmseqs tsv2exprofiledb, not mmseqs tar2exprofiledb.
Also a bug at line 26: tar --extract --verbose --file="${DOWNLOAD_DIR}/${f}" \
I think it should be tar --extract --verbose --file="${f}" \
Hi,
Thanks for the help last time.
However, I now have another error.
After running training with ProteinNet input:
python /data/openfold/train_openfold.py /data/af_databases/pdb_mmcif/mmcif_files/ /home/ubuntu/ProteinNet_parsed/ProteinNet_lc/ /data/af_databases/pdb_mmcif/mmcif_files/ /home/ubuntu/OF_train_from_Protein_Net/try_1_Dec29_2021/ 2021-10-10 --template_release_dates_cache_path /data/af_databases/pdb_mmcif/mmcif_cache.json --precision 16 --replace_sampler_ddp=True--deepspeed_config /data/deepspeed_config.json --default_root_dir /home/ubuntu/OF_train_from_Protein_Net/try_1_Dec29_2021/ --gpus 1 --seed 44
I got this error:
###############
Epoch 0: 0%| | 0/50939 [00:00<?, ?it/s]Traceback (most recent call last):
File "/data/openfold/train_openfold.py", line 336, in <module>
main(args)
File "/data/openfold/train_openfold.py", line 196, in main
ckpt_path=ckpt_path,
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 736, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 682, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1193, in _run
self._dispatch()
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1272, in _dispatch
self.training_type_plugin.start_training(self)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
self._results = trainer.run_stage()
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1282, in run_stage
return self._run_train()
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1312, in _run_train
self.fit_loop.run()
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
self.epoch_loop.run(data_fetcher)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 140, in run
self.on_run_start(*args, **kwargs)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 141, in on_run_start
self._dataloader_iter = _update_dataloader_iter(data_fetcher, self.batch_idx + 1)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/loops/utilities.py", line 121, in _update_dataloader_iter
dataloader_iter = enumerate(data_fetcher, batch_idx)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/utilities/fetching.py", line 199, in __iter__
self.prefetching(self.prefetch_batches)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/utilities/fetching.py", line 258, in prefetching
self._fetch_next_batch()
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/utilities/fetching.py", line 300, in _fetch_next_batch
batch = next(self.dataloader_iter)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/supporters.py", line 536, in __next__
return self.request_next_batch(self.loader_iters)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/trainer/supporters.py", line 548, in request_next_batch
return apply_to_collection(loader_iters, Iterator, next)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/utilities/apply_func.py", line 92, in apply_to_collection
return function(data, *args, **kwargs)
File "/data/openfold/openfold/data/data_modules.py", line 350, in _batch_prop_gen
for batch in iterator:
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/_utils.py", line 434, in reraise
raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/openfold/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/openfold/openfold/data/data_modules.py", line 178, in __getitem__
chain_id=chain_id,
File "/data/openfold/openfold/data/data_pipeline.py", line 577, in process_pdb
with open(pdb_path, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/af_databases/pdb_mmcif/mmcif_files/4l6v_9.pdb'
Epoch 0: 0%| | 0/50939 [00:00<?, ?it/s]
Thanks,
Oz
I was a bit surprised that the number of recycling iterations is sampled during validation. This makes different validation epochs less comparable and the progress less smooth. I think eval should mimic predict in this respect.
max_iters = self.config.common.max_recycling_iters
if(stage_cfg.supervised):
    clamp_prob = self.config.supervised.clamp_prob
    keyed_probs.append(
        ("use_clamped_fape", [1 - clamp_prob, clamp_prob])
    )
if(self.stage == "train" and self.config.supervised.uniform_recycling):
    recycling_probs = [
        1. / (max_iters + 1) for _ in range(max_iters + 1)
    ]
    keyed_probs.append(
        ("no_recycling_iters", recycling_probs)
    )
else:
    recycling_probs = [
        0. for _ in range(max_iters + 1)
    ]
    recycling_probs[-1] = 1.
    keyed_probs.append(
        ("no_recycling_iters", recycling_probs)
    )
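The branching above can be sketched in isolation to show its effect. This is a standalone illustration, not the data module's code: `recycling_probs` and `sample_no_recycling_iters` are hypothetical helpers, and the stage names "train" and "eval" are assumed.

```python
import random


def recycling_probs(stage: str, max_iters: int, uniform_recycling: bool):
    """Probability of each recycling count 0..max_iters for a given stage."""
    n = max_iters + 1
    if stage == "train" and uniform_recycling:
        return [1.0 / n] * n  # every count equally likely
    probs = [0.0] * n
    probs[-1] = 1.0           # always use the maximum number of iterations
    return probs


def sample_no_recycling_iters(probs, rng: random.Random) -> int:
    """Draw one recycling count according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)
train_draws = {sample_no_recycling_iters(recycling_probs("train", 3, True), rng)
               for _ in range(200)}
eval_draws = {sample_no_recycling_iters(recycling_probs("eval", 3, True), rng)
              for _ in range(200)}
print(sorted(train_draws), sorted(eval_draws))
```

With this split, training still sees a mix of recycling counts while every validation batch uses the same count, so validation metrics from different epochs are computed under the same conditions.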
device: 1 A100 with 40GB memory
cuda: 11.3
Compared with https://github.com/dptech-corp/Uni-Fold, using the model_2 setting and the same data (only one sample, with DummyDataLoader in openfold).
Following issue #19, I disabled clear_cache_between_blocks and DeepSpeed CPU offloading.
The commit I used is c4d9f57
speed per example:
| | FP32 | FP16 |
|---|---|---|
| openfold | 24.5 s | 17 s |
| Uni-Fold | 13.25 s | 8.9 s |
Is that expected? Are there any tricks I can use to get a further speed-up?
Line 1139 of openfold/openfold/data/data_transforms.py (at commit 03bb003) should not exist. Otherwise, most of the features are empty when there are no templates.
See corresponding line in the original AF2:
Most of the scripts at https://github.com/sokrypton/ColabFold have a ton of additional flexibility that comes in handy when running AF on de novo sequences (for which you usually can't generate an MSA) or when doing protein design.
Can this codebase also be leveraged to:
In some cases, especially for larger crop sizes, intermediate tensors during training grow so large that PyTorch runs out of memory despite having allocated as little as 60% of available GPU memory. It would be good to carefully profile the network to identify the worst culprit modules and come up with clean ways to prevent such degenerate tensor allocation.
When I use script_preset_(model_module), I get the following error:
RuntimeError:
'Tensor' object has no attribute or method 'new_ones'.:
File "openfold/openfold/model/msa.py", line 118
if mask is None:
# [*, N_seq, N_res]
mask = m.new_ones(
~~~~~~~~~~ <--- HERE
m.shape[:-3] + (n_seq, n_res),
)
Hey epic work!
Could you post a Dockerfile for training/inference?
Thanks!
Well done! I am wondering whether I can train this model on 4 TITAN 2080 GPUs with 12 GB each. Will I run into out-of-memory errors?
For training with the ColabFold pipeline (and templates from HHsearch), there is a path argument template_mmcif_dir.
Should it point to something like data/pdb_mmcif/mmcif_files/ or to some other precomputed folder?
Hi, I am running the script prep_mmseqs_dbs.sh. I've applied the correction to the script, changing tar2exprofiledb to tsv2exprofiledb.
However, the script extracts the files and then returns the following error:
uniclust30_2018_08/uniclust30_2018_08_a3m.ffdata
uniclust30_2018_08/uniclust30_2018_08_a3m.ffindex
uniclust30_2018_08/uniclust30_2018_08_hhm.ffdata
uniclust30_2018_08/uniclust30_2018_08_hhm.ffindex
uniclust30_2018_08/uniclust30_2018_08_cs219.ffdata
uniclust30_2018_08/uniclust30_2018_08_cs219.ffindex
uniclust30_2018_08/uniclust30_2018_08.cs219
uniclust30_2018_08/uniclust30_2018_08.cs219.sizes
uniclust30_2018_08/uniclust30_2018_08_a3m_db
uniclust30_2018_08/uniclust30_2018_08_a3m_db.index
uniclust30_2018_08/uniclust30_2018_08_hhm_db
uniclust30_2018_08/uniclust30_2018_08_hhm_db.index
uniclust30_2018_08/uniclust30_2018_08_md5sum
../../scripts/prep_mmseqs_dbs.sh: line 33: mmseqs: command not found
I do have mmseqs installed.
Could anyone help me?
When I use strategy='ddp', train_openfold.py fails with the following error:
RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons: 1) Use of a module parameter outside the forward function. Please make sure model parameters are not shared across multiple concurrent forward-backward passes. or try to use _set_static_graph() as a workaround if this module graph does not change during training loop.2) Reused parameters in multiple reentrant backward passes. For example, if you use multiple checkpoint functions to wrap the same part of your model, it would result in the same set of parameters been used by different reentrant backward passes multiple times, and hence marking a variable ready multiple times. DDP does not support such use cases in default. You can try to use _set_static_graph() as a workaround if your module graph does not change over iterations.
Parameter at index 4983 has been marked as ready twice. This means that multiple autograd engine hooks have fired for this particular parameter during this iteration. You can set the environment variable TORCH_DISTRIBUTED_DEBUG to either INFO or DETAIL to print parameter names for further debugging.
As it stands, only the attention primitives Attention and GlobalAttention are TorchScript-ed (or, for that matter, TorchScript-able) during inference. For better runtimes and memory allocation, more of the network's modules, especially in the Evoformer, should be made compatible with TorchScript. In my estimation, the biggest hurdle before this goal is the inference-time chunking functionality, which currently makes heavy use of function pointers not supported by TorchScript.
Thanks for such a great repo! I get the following issue when running the model (but only when I use GPUs). I'm using torch checkpointing, not deepspeed. I saw an issue similar to this, but it seemed to be deepspeed-specific so I thought I'd repost.
RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons: 1) Use of a module parameter outside the forward function. Please make sure model parameters are not shared across multiple concurrent forward-backward passes. or try to use _set_static_graph() as a workaround if this module graph does not change during training loop.2) Reused parameters in multiple reentrant backward passes. For example, if you use multiple checkpoint functions to wrap the same part of your model, it would result in the same set of parameters been used by different reentrant backward passes multiple times, and hence marking a variable ready multiple times. DDP does not support such use cases in default. You can try to use _set_static_graph() as a workaround if your module graph does not change over iterations.
Parameter at index 2000 with name module.model.evoformer.blocks.19.pair_transition.linear_2.bias [For reference I have 20 blocks ] has been marked as ready twice. This means that multiple autograd engine hooks have fired for this particular parameter during this iteration.
Hi,
Thanks for this great work.
Just wondering, is there a way to do complex (multimer) prediction as in alphafold multimer ?
Thanks
Oz
I've implemented low-memory attention (9670958) using an algorithm from a recent preprint (https://arxiv.org/pdf/2112.05682.pdf), enhanced a little bit with the ability to add multiple biases + batch dimensions. Lacking the JAX map & scan used in the original implementation, which I've had to replace with for loops, ours is quite a bit slower (exact figures depend heavily on the choice of chunk sizes, but it seems to be in the ballpark of 2x slower than our own standard Attention implementation). It would be nice to speed it up a little.
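To make the technique concrete, here is a minimal pure-Python sketch of the running-softmax idea behind chunked attention for a single query vector: keys and values are visited one chunk at a time while a running maximum, normaliser and weighted sum are kept, so the full score vector is never materialised. It leaves out the biases and batch dimensions mentioned above and is not the code from the commit; all function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def naive_attention(q, keys, values):
    """Reference: softmax over all scores at once."""
    scores = [dot(q, k) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, values)) / z
            for j in range(len(values[0]))]


def chunked_attention(q, keys, values, chunk_size):
    """Same result, visiting keys/values chunk_size entries at a time."""
    running_max = -math.inf
    denom = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), chunk_size):
        k_chunk = keys[start:start + chunk_size]
        v_chunk = values[start:start + chunk_size]
        scores = [dot(q, k) for k in k_chunk]
        new_max = max(running_max, max(scores))
        # Rescale what has been accumulated so far to the new maximum
        # (exp(-inf) is 0.0, which handles the first chunk).
        scale = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]
        denom = denom * scale + sum(weights)
        acc = [a * scale + sum(w * v[j] for w, v in zip(weights, v_chunk))
               for j, a in enumerate(acc)]
        running_max = new_max
    return [a / denom for a in acc]


q = [0.5, -1.0]
keys = [[i * 0.1, 1.0 - i * 0.2] for i in range(7)]
values = [[float(i), float(i * i)] for i in range(7)]
reference = naive_attention(q, keys, values)
chunked = chunked_attention(q, keys, values, chunk_size=3)
print(all(abs(a - b) < 1e-9 for a, b in zip(reference, chunked)))
```

The Python loop over chunks is the part that a JAX map/scan would fuse, which fits the slowdown described above: smaller chunks lower peak memory but add more loop iterations.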
Thank you for sharing your code!
I am trying to train openfold, but the problem of loss being NAN persists, and the whole training hangs when this problem occurs.
I downloaded the code in early December and trained on 8 V100 cards with a training dataset size of 1000. When I ran to the 26th sample of the 2nd epoch, there were many warning outputs with a loss of NAN and the training was interrupted.
I read your suggestion to "Replace training_step in train_openfold.py" in issue #19. After making that change, I got this on the first training sample:
WARNING:root:loss is NaN. Returning 0 loss...
Training still hangs.
I ran a recent commit again and retrained with the same dataset and the same problem occurred again and on the same sample. Like this:
I changed the way the mapping is generated in data_modules.py so that the dataset is loaded in a fixed order, and I checked the data where the loss is NaN and found no abnormalities.
This is very strange, because with your first version of the code, there is no NAN loss so far, but with the version you committed after December this problem keeps occurring, even if I change my training dataset and the learning rate in the deepspeed config file, it does not improve the situation.
Is there a workaround for this situation?
[EDIT: I can see 'plddt' is part of the output, closing issue, will reopen if it's not the per-residue confidence score]
Thank you for this amazing repo!
Is there any suggested way to output the per-residue confidence score that AlphaFold produces?
When I run inference with openfold, some proteins succeed but others fail with an error. The likely cause is that a template hit refers to an obsolete entry whose CIF file is not present in the template_mmcif_dir directory.
Below is an example of a failing protein:
5IZB_A
Traceback (most recent call last):
File "run_pretrained_openfold.py", line 253, in <module>
main(args)
File "run_pretrained_openfold.py", line 118, in main
fasta_path=fasta_path, alignment_dir=local_alignment_dir
File "/home/jsr/openfold/openfold/data/data_pipeline.py", line 420, in process_fasta
self.template_featurizer,
File "/home/jsr/openfold/openfold/data/data_pipeline.py", line 55, in make_template_features
hits=hits_cat,
File "/home/jsr/openfold/openfold/data/templates.py", line 1059, in get_templates
kalign_binary_path=self._kalign_binary_path,
File "/home/jsr/openfold/openfold/data/templates.py", line 827, in _process_single_hit
with open(cif_path, "r") as cif_file:
FileNotFoundError: [Errno 2] No such file or directory: '/public/database/alphafold2_database/mmcif/mmcif_files/4zai.cif'
I ran precompute_alignments.py to precompute alignments for 184,700 proteins before training, because I want to use the same data as AlphaFold. However, a single alignment (1yxq) took about 4 hours. Is my setup correct? Also, are there any precomputed alignments available for download that would save me the alignment time?
From downloading DeepMind's pretrained parameters, there are 5 models and for each model there is a .npz file and a _ptm.npz file. May I know what the 5 different models are and what the corresponding _ptm.npz files mean?
New issue based on: #34
Turning on bfloat16 in deepspeed doesn't seem to have the desired effect. Model params size remains unchanged. Hitting OOM in validation which works fine in FP16.
Training with bfloat16 in pytorch-lightning fails:
File "openfold/openfold/utils/loss.py", line 46, in sigmoid_cross_entropy
log_p = torch.nn.functional.logsigmoid(logits)
RuntimeError: "log_sigmoid_forward_cuda" not implemented for 'BFloat16'
Support still missing in deepspeed? microsoft/DeepSpeed#974
Tested on A100 with torch 1.10.1+cu113
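For reference, the quantity the failing line computes has a numerically stable closed form, log σ(x) = min(x, 0) − log(1 + e^(−|x|)). The sketch below evaluates it in plain Python as an illustration of the math only; it is not a replacement for the missing BFloat16 CUDA kernel, and the way the two terms are combined into a cross-entropy is assumed from the function name rather than taken from loss.py.

```python
import math


def log_sigmoid(x: float) -> float:
    """Stable log(sigmoid(x)): never exponentiates a positive number."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def sigmoid_cross_entropy(logit: float, label: float) -> float:
    """Binary cross-entropy on a logit (assumed form, scalar inputs)."""
    log_p = log_sigmoid(logit)
    log_not_p = log_sigmoid(-logit)
    return -label * log_p - (1.0 - label) * log_not_p


print(log_sigmoid(0.0), log_sigmoid(-1000.0), sigmoid_cross_entropy(0.0, 1.0))
```

Since the function only needs min, abs, exp and log1p, one could imagine evaluating the loss in a wider dtype and casting the result back, though whether that fits this code path is untested here.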
Hi, I've processed some data for training, but the dataloader fails with:
File "/share/home/openfold/openfold-main/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/utilities/apply_func.py", line 92, in apply_to_collection
return function(data, *args, **kwargs)
File "/share/home/openfold/openfold-main/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/share/home/openfold/openfold-main/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/share/home/openfold/openfold-main/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/share/home/openfold/openfold-main/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/_utils.py", line 434, in reraise
raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/share/home/openfold/openfold-main/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/share/home/openfold/openfold-main/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
return self.collate_fn(data)
File "/share/home/openfold/openfold-main/lib/conda/envs/openfold_venv/lib/python3.7/site-packages/pytorch_lightning/utilities/auto_restart.py", line 474, in _capture_metadata_collate
data = default_collate(samples)
File "/share/home/openfold/openfold-main/openfold/data/data_modules.py", line 297, in __call__
prot, self.stage
File "/share/home/openfold/openfold-main/openfold/data/feature_pipeline.py", line 116, in process_features
mode=mode,
File "/share/home/openfold/openfold-main/openfold/data/feature_pipeline.py", line 93, in np_example_to_features
cfg[mode],
File "/share/home/openfold/openfold-main/openfold/data/input_pipeline.py", line 187, in process_tensors_from_config
lambda x: wrap_ensemble_fn(tensors, x), torch.arange(num_recycling + 1)
File "/share/home/openfold/openfold-main/openfold/data/input_pipeline.py", line 201, in map_fn
ensembles = [fun(elem) for elem in x]
File "/share/home/openfold/openfold-main/openfold/data/input_pipeline.py", line 201, in <listcomp>
ensembles = [fun(elem) for elem in x]
File "/share/home/openfold/openfold-main/openfold/data/input_pipeline.py", line 187, in <lambda>
lambda x: wrap_ensemble_fn(tensors, x), torch.arange(num_recycling + 1)
File "/share/home/openfold/openfold-main/openfold/data/input_pipeline.py", line 168, in wrap_ensemble_fn
return fn(d)
File "/share/home/openfold/openfold-main/openfold/data/data_transforms.py", line 76, in <lambda>
return lambda x: f(x, *args, **kwargs)
File "/share/home/openfold/openfold-main/openfold/data/input_pipeline.py", line 196, in compose
x = f(x)
File "/share/home/openfold/openfold-main/openfold/data/data_transforms.py", line 76, in <lambda>
return lambda x: f(x, *args, **kwargs)
File "/share/home/openfold/openfold-main/openfold/data/data_transforms.py", line 180, in sample_msa
num_seq = protein["msa"].shape[0]
TypeError: 'function' object is not subscriptable
I have uploaded one of the data samples. Is the problem in my MSA generation pipeline or in the dataloader?
5E0Y.zip
First of all, great work!
As you know, Evolutionary Scale Modeling (ESM) generates an embedding of size #aminoacids × 1280 for each protein sequence. I was wondering whether we could get such information from openfold as well. Do you think it is possible to extract such an embedding from the inner layers of openfold?
Could you give some guidance on how to extract it?
Thanks!
When I run scripts/install_third_party_dependencies.sh, it fails at the last step:
gzip: tests/test_data/sample_feats.pickle.gz: No such file or directory
This is because the file sample_feats.pickle.gz is not downloaded.
Hi all! Firstly, thanks for your work and effort! I noticed that in the config file, the weight for each loss is different than that in af2's paper. For example, the weight for angle loss is 1, instead of 0.3. Some of them, such as the violation loss, experimentally solved loss, have a weight of 0. Is there any reason that the weight is set up this way? For instance, for losses that have been assigned a weight of 0 in the implementation, are they still under testing? Thanks!
Traceback (most recent call last):
File "/ocean/projects/bio210060p/kadyan/openfold-release/scripts/precompute_template_hits.py", line 224, in <module>
main(args, template_pipeline_runner)
File "/ocean/projects/bio210060p/kadyan/openfold-release/scripts/precompute_template_hits.py", line 116, in main
feature_dict = template_pipeline_runner.run(a3m_dir, fasta_file_path)
File "/ocean/projects/bio210060p/kadyan/openfold-release/scripts/precompute_template_hits.py", line 80, in run
alignment_dir=a3m_dir,
File "/ocean/projects/bio210060p/kadyan/openfold-release/openfold/data/data_pipeline.py", line 360, in process_fasta
hits=hits_cat,
File "/ocean/projects/bio210060p/kadyan/openfold-release/openfold/data/templates.py", line 1058, in get_templates
kalign_binary_path=self._kalign_binary_path,
File "/ocean/projects/bio210060p/kadyan/openfold-release/openfold/data/templates.py", line 828, in _process_single_hit
with open(cif_path, "r") as cif_file:
FileNotFoundError: [Errno 2] No such file or directory: '/databases/pdb_mmcif/mmcif_files/6ek0.cif'
ISSUE: New entries added in obsolete.dat will fail because the corresponding replacements will not be found in the pre-downloaded pdb_mmcifs.
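A pre-flight check along these lines might surface the problem before a long run. It is a sketch under stated assumptions: obsolete.dat lines are taken to look like `OBSLTE <date> <old id> <new id>`, replacement files are assumed to be named `<id>.cif` in lower case, and `parse_obsolete` and `missing_replacements` are hypothetical helpers, not functions from the repo. The second sample line uses a made-up id to show an entry without a replacement.

```python
from pathlib import Path

# Illustrative input; the second entry is invented and has no replacement.
SAMPLE = (
    "OBSLTE    31-JAN-94 116L     216L\n"
    "OBSLTE    01-JAN-00 9XXX\n"
)


def parse_obsolete(text: str) -> dict:
    """Map obsolete PDB ids to their first listed replacement (lower case)."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip header lines and entries that list no replacement id.
        if len(fields) >= 4 and fields[0] == "OBSLTE":
            mapping[fields[2].lower()] = fields[3].lower()
    return mapping


def missing_replacements(mapping: dict, mmcif_dir) -> list:
    """Replacement ids whose .cif file is absent from mmcif_dir."""
    root = Path(mmcif_dir)
    return sorted(
        new_id for new_id in set(mapping.values())
        if not (root / f"{new_id}.cif").exists()
    )


mapping = parse_obsolete(SAMPLE)
print(mapping)
```

Calling `missing_replacements(mapping, "/path/to/mmcif_files")` would then list the replacement entries that still need to be downloaded.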
I get CUDA OOM error when I add my validation set, which I can predict just fine with run_pretrained_openfold.py
Are you limiting your validation set to a certain size? I assume the problem is because of the additional features necessary to compute the loss.
I had to do some changes to make validation work:
val needs to be changed to eval in data_modules, e.g.:
https://github.com/aqlaboratory/openfold/blob/main/openfold/data/data_modules.py#L153
The third argument "unclamped" no longer exists:
https://github.com/aqlaboratory/openfold/blob/main/openfold/data/data_modules.py#L188
Switch validation also to _output_raw=True
Section 1.2.5 of the AlphaFold 2 supplement describes some filters that are applied to the training data. Does the latest code not include this part?
I was having issues with the prep_mmseqs_dbs.sh script, so I tried running the steps individually, and I'm having an issue running mmseqs tsv2exprofiledb with the colabfold_envdb_202108 database.
First, I downloaded the databases using the download_mmseqs_dbs.sh script and then ran the tar command according to the example in prep_mmseqs_dbs.sh, such that I had a directory with the following files:
colabfold_envdb_202108.tsv
colabfold_envdb_202108_seq.tsv
colabfold_envdb_202108_aln.tsv
colabfold_envdb_202108_h.tsv
uniref30_2103.md5sums
uniref30_2103.tsv
uniref30_2103_h.tsv
uniref30_2103_aln.tsv
uniref30_2103_seq.tsv
I then used mmseqs tsv2exprofiledb mmseqs_dbs/uniref30_2103 /mmseqs/uniref30_2103_db
which seemed to complete without error (though there is no .idx file, which is supposed to be the output of this command, I believe), generating the following files:
uniref30_2103_db.dbtype
uniref30_2103_db_seq_tmp
uniref30_2103_db.index
uniref30_2103_db_seq_tmp.index.0
uniref30_2103_db.sh
uniref30_2103_db_h
uniref30_2103_db.0
uniref30_2103_db.1
uniref30_2103_db_h.dbtype
uniref30_2103_db_h.index
However, when I tried to do the same with the colabfold_envdb_202108 database, it seemed to start correctly, but then was killed after a minute or two. The following files were generated:
colabfold_envdb_202108_db.sh
colabfold_envdb_202108_db_h
colabfold_envdb_202108_db_h.index.0
I used nohup, and this is the extent of the output from that command:
tsv2exprofiledb /mmseqs_dbs/colabfold_envdb_202108 /mmseqs_dbs/colabfold_envdb_202108_db
MMseqs Version: 4f046dd1979ec87b440656ff13b12e5c525b8374
Verbosity 3
Killed
I'm wondering if I'm using an instance with insufficient RAM. Do you have an idea of the amount of RAM needed for the idx files?
Great work reproducing the original code and creating an open-source PyTorch implementation ☕️☕️☕️☕️
When I try to run the attached Colab notebook, importing data_pipeline from openfold.data in the "Search against genetic databases" subsection raises an ImportError:
ImportError: cannot import name 'MultipleChainsError' from 'openfold.data.templates' (/opt/conda/lib/python3.7/site-packages/openfold/data/templates.py)
The full traceback is attached below :-
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-7-8051d602620b> in <module>()
27 from openfold.data import feature_pipeline
28 from openfold.data import parsers
---> 29 from openfold.data import data_pipeline
30 from openfold.data.tools import jackhmmer
31 from openfold.model import model
/opt/conda/lib/python3.7/site-packages/openfold/data/data_pipeline.py in <module>()
20 import numpy as np
21
---> 22 from openfold.data import templates, parsers, mmcif_parsing
23 from openfold.data.tools import jackhmmer, hhblits, hhsearch
24 from openfold.data.tools.utils import to_date
/opt/conda/lib/python3.7/site-packages/openfold/data/templates.py in <module>()
26 import numpy as np
27
---> 28 from openfold.data import parsers, mmcif_parsing
29 from openfold.data.tools import kalign
30 from openfold.data.tools.utils import to_date
/opt/conda/lib/python3.7/site-packages/openfold/data/mmcif_parsing.py in <module>()
27 import numpy as np
28
---> 29 from openfold.data.templates import MultipleChainsError
30 import openfold.np.residue_constants as residue_constants
31
ImportError: cannot import name 'MultipleChainsError' from 'openfold.data.templates' (/opt/conda/lib/python3.7/site-packages/openfold/data/templates.py)
Interestingly, if I add from openfold.data.templates import MultipleChainsError, I run into a circular ImportError. The error trace is attached below:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-8-71256580fa0c> in <module>()
27 from openfold.data import feature_pipeline
28 from openfold.data import parsers
---> 29 from openfold.data.templates import MultipleChainsError
30 from openfold.data import data_pipeline
31 from openfold.data.tools import jackhmmer
/opt/conda/lib/python3.7/site-packages/openfold/data/templates.py in <module>()
26 import numpy as np
27
---> 28 from openfold.data import parsers, mmcif_parsing
29 from openfold.data.tools import kalign
30 from openfold.data.tools.utils import to_date
/opt/conda/lib/python3.7/site-packages/openfold/data/mmcif_parsing.py in <module>()
27 import numpy as np
28
---> 29 from openfold.data.templates import MultipleChainsError
30 import openfold.np.residue_constants as residue_constants
31
ImportError: cannot import name 'MultipleChainsError' from 'openfold.data.templates' (/opt/conda/lib/python3.7/site-packages/openfold/data/templates.py)
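For what it's worth, both traces show templates.py importing mmcif_parsing while mmcif_parsing imports a name back from templates.py, so the name does not exist yet when it is requested. One common way out is to move the shared exception into a module that neither file depends on. Below is a minimal sketch of that layout, not the repository's actual fix: the package name demo_pkg, the errors.py module, and the parse/load functions are all hypothetical and only illustrate the import order.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical three-module layout; every name here is illustrative only.
FILES = {
    "demo_pkg/__init__.py": "",
    # The shared exception lives in a module that imports nothing else
    # from the package, so it can never take part in an import cycle.
    "demo_pkg/errors.py": """
        class MultipleChainsError(Exception):
            pass
    """,
    "demo_pkg/mmcif_parsing.py": """
        from demo_pkg.errors import MultipleChainsError

        def parse(num_chains):
            if num_chains != 1:
                raise MultipleChainsError(f"expected 1 chain, got {num_chains}")
            return "ok"
    """,
    "demo_pkg/templates.py": """
        from demo_pkg import mmcif_parsing
        from demo_pkg.errors import MultipleChainsError

        def load(num_chains):
            try:
                return mmcif_parsing.parse(num_chains)
            except MultipleChainsError:
                return "skipped"
    """,
}


def build_and_import(root):
    """Write the demo package under root and import demo_pkg.templates."""
    for rel_path, source in FILES.items():
        path = Path(root) / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(textwrap.dedent(source))
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    return importlib.import_module("demo_pkg.templates")


_tmp = tempfile.TemporaryDirectory()
templates = build_and_import(_tmp.name)
print(templates.load(1), templates.load(2))
```

With this layout, templates can still import mmcif_parsing, and both can import the exception, without either module needing the other to be fully initialized first.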
I noticed a small bug in prep_mmseqs_dbs.sh. The script fails because the mmseqs_dbs directory does not exist. I made a branch to open a pull request, but I got an error saying permission was denied. I also updated the README to fix the instructions for running this script (download_mmseqs_databases.sh -> download_mmseqs_dbs.sh, prep_mmseqs_databases.sh -> prep_mmseqs_dbs.sh). Here are the changes I propose to prep_mmseqs_dbs.sh:
#!/bin/bash
#
# Copyright 2021 AlQuraishi Laboratory
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Extracts the downloaded MMseqs2 archives and builds an indexed database for each.
#
# Usage: bash prep_mmseqs_dbs.sh /path/to/download/directory
set -e
DOWNLOAD_DIR="$1"
ROOT_DIR="${DOWNLOAD_DIR}/mmseqs_dbs"
mkdir --parents "${ROOT_DIR}"
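for f in $(ls ${DOWNLOAD_DIR}/*.tar.gz)
do
  # Unpack the archive into the database directory, then delete it.
  tar --extract --verbose --file="${f}" \
      --directory="${ROOT_DIR}"
  rm "${f}"
  # Strip the extensions and the directory to get the database name.
  BASENAME="$(basename "${f%%.*}")"
  DB_NAME="${BASENAME}_db"
  OLD_PWD=$(pwd)
  cd "${ROOT_DIR}"
  # Convert the extracted TSV files into an MMseqs2 profile database and index it.
  mmseqs tsv2exprofiledb "${BASENAME}" "${DB_NAME}"
  mmseqs createindex "${DB_NAME}" "${DOWNLOAD_DIR}/tmp/"
  cd "${OLD_PWD}"
done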
I ran train_openfold.py, and when training reached the validation step, I got an error saying that the 'OpenFoldWrapper' object has no attribute 'cached_weights'. Can you help me see what is wrong?
Traceback (most recent call last):
File "train_openfold.py", line 370, in <module>
main(args)
File "train_openfold.py", line 233, in main
ckpt_path=ckpt_path,
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 739, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 683, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 773, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1195, in _run
self._dispatch()
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1275, in _dispatch
self.training_type_plugin.start_training(self)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
self._results = trainer.run_stage()
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1285, in run_stage
return self._run_train()
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1315, in _run_train
self.fit_loop.run()
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
self.epoch_loop.run(data_fetcher)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 146, in run
self.on_advance_end()
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 242, in on_advance_end
self._run_validation()
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 337, in _run_validation
self.val_loop.run()
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 110, in advance
dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 145, in run
self.advance(*args, **kwargs)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 122, in advance
output = self._evaluation_step(batch, batch_idx, dataloader_idx)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 217, in _evaluation_step
output = self.trainer.accelerator.validation_step(step_kwargs)
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 236, in validation_step
return self.training_type_plugin.validation_step(*step_kwargs.values())
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 219, in validation_step
return self.model.validation_step(*args, **kwargs)
File "train_openfold.py", line 108, in validation_step
if(self.cached_weights is None):
File "/public/tools/anaconda3/envs/openfold/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1178, in __getattr__
type(self).__name__, name))
AttributeError: 'OpenFoldWrapper' object has no attribute 'cached_weights'
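The traceback shows validation_step testing self.cached_weights against None before anything has assigned it, which is what raises the AttributeError. A likely remedy is to declare the attribute in the constructor. The sketch below uses a plain Python stand-in rather than the real OpenFoldWrapper. It assumes the attribute is meant to hold a copy of the training weights while validation runs with a different set of weights, which the traceback itself does not show. The class name, the dictionaries, and on_validation_end are hypothetical.

```python
class WrapperSketch:
    """Plain-Python stand-in for a training wrapper (not the real class)."""

    def __init__(self, weights):
        self.weights = dict(weights)
        # Pretend averaged copy of the weights, used only during validation.
        self.averaged_weights = {k: v * 0.5 for k, v in weights.items()}
        # Assumed fix: declare the attribute up front so the first
        # validation step can compare it with None without raising.
        self.cached_weights = None

    def validation_step(self):
        if self.cached_weights is None:
            # Stash the training weights and switch to the averaged ones.
            self.cached_weights = dict(self.weights)
            self.weights = dict(self.averaged_weights)
        return self.weights

    def on_validation_end(self):
        # Restore the training weights and clear the cache for next time.
        if self.cached_weights is not None:
            self.weights = self.cached_weights
            self.cached_weights = None


wrapper = WrapperSketch({"w": 2.0})
during_validation = dict(wrapper.validation_step())
wrapper.on_validation_end()
print(during_validation, wrapper.weights, wrapper.cached_weights)
```

If the attribute is only ever assigned inside a hook that did not run (for example, because of a version difference in the training framework), initializing it in the constructor keeps the check in validation_step safe either way.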
Nice work! Can openfold achieve AF2 performance?
Hi,
Thanks for the great effort. I was looking at line 282 of openfold/openfold/utils/affine_utils.py (commit a933bc7). I wonder whether this line should be translation = -1 * ca_xyz rather than translation = -1 * c_xyz?
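To make the difference concrete, here is a small sketch of the centering step. It assumes the surrounding function is meant to move the backbone so that the C-alpha atom sits at the origin. The function name center_on_ca and the tuple-based coordinates are hypothetical and are not the repository's code.

```python
def center_on_ca(n_xyz, ca_xyz, c_xyz):
    """Shift N, CA and C so that CA lands on the origin (assumed intent)."""
    # The translation is the negated CA position, not the negated C position.
    translation = tuple(-1 * v for v in ca_xyz)

    def shift(point):
        return tuple(p + t for p, t in zip(point, translation))

    return shift(n_xyz), shift(ca_xyz), shift(c_xyz), translation


n, ca, c, t = center_on_ca((0.0, 1.0, 0.0), (1.0, 1.0, 1.0), (2.0, 1.0, 1.0))
print(n, ca, c, t)
```

Using -1 * c_xyz instead would put the carbonyl carbon at the origin, so the frame's origin would be off by the CA to C bond vector.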
Currently, the FAPE loss is clamped in 90% of cases during validation. I'm wondering whether this should be made deterministic (always clamp or never clamp) to make validation runs more comparable.
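One way to express the proposal is to keep the random choice for training batches and use a fixed choice for validation. The sketch below is only an illustration under stated assumptions: the helper names use_clamp and fape_term, the eval_clamp flag, and the 10.0 clamp value are hypothetical and not taken from the repository's loss code.

```python
import random

CLAMP_DISTANCE = 10.0  # assumed clamp value in angstroms, for illustration


def use_clamp(training, rng, clamp_prob=0.9, eval_clamp=True):
    """Decide whether the error for the current batch is clamped."""
    if training:
        # Stochastic choice: clamp in roughly clamp_prob of training batches.
        return rng.random() < clamp_prob
    # Validation: fixed choice, so every run scores batches the same way.
    return eval_clamp


def fape_term(distance, clamp):
    """Single error term, optionally clamped at CLAMP_DISTANCE."""
    return min(distance, CLAMP_DISTANCE) if clamp else distance


rng = random.Random(0)
val_choices = {use_clamp(False, rng) for _ in range(100)}
print(val_choices, fape_term(25.0, True), fape_term(25.0, False))
```

With eval_clamp fixed, two validation runs over the same data no longer differ because of which batches happened to be clamped.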