kosinskilab / alphapulldown
Home Page: https://doi.org/10.1093/bioinformatics/btac749
License: GNU General Public License v3.0
Which data dir on our cluster (your copy, without symlinks) should be used for the singularity option?
This is really for later and to discuss first:
Sometimes we do want to re-calculate MSA and templates for a chopped fragment, so create_features should have an option to take prot_A,start-end as input and create features that preserve numbering.
Hi,
I was trying to run one protein against 200 protein features in the second step and got this error when running it on one GPU. Could I have some suggestions about this error?
Thanks!
Ning
I0915 16:06:40.291960 46912496410304 run_multimer_jobs.py:158] done creating multimer AcrIF1_and_CAADJO010000003.1_2775
I0915 16:06:40.625929 46912496410304 run_multimer_jobs.py:158] done creating multimer AcrIF1_and_CAADJO010000003.1_278
I0915 16:06:40.790300 46912496410304 run_multimer_jobs.py:158] done creating multimer AcrIF1_and_CAADJO010000003.1_2851
I0915 16:06:40.859936 46912496410304 run_multimer_jobs.py:158] done creating multimer AcrIF1_and_CAADJO010000003.1_2994
I0915 16:06:41.130373 46912496410304 xla_bridge.py:264] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I0915 16:06:41.459139 46912496410304 xla_bridge.py:264] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
I0915 16:06:46.201552 46912496410304 run_multimer_jobs.py:236] now running prediction on AcrIF1_and_AAQW01000001.1_2315
I0915 16:06:46.201746 46912496410304 predict_structure.py:40] Checking for /data/duann2/virus_proj/alphapd/pd_parallel/pd_step2/output/models/AcrIF1_and_AAQW01000001.1_2315/ranking_debug.json
I0915 16:06:46.202050 46912496410304 predict_structure.py:48] Running model model_1_multimer_v2_pred_0 on AcrIF1_and_AAQW01000001.1_2315
I0915 16:06:46.202407 46912496410304 model.py:166] Running predict with shape(feat) = {'aatype': (610,), 'residue_index': (610,), 'seq_length': (), 'msa': (2055, 610), 'num_alignments': (), 'template_aatype': (4, 610), 'template_all_atom_mask': (4, 610, 37), 'template_all_atom_positions': (4, 610, 37, 3), 'asym_id': (610,), 'sym_id': (610,), 'entity_id': (610,), 'deletion_matrix': (2055, 610), 'deletion_mean': (610,), 'all_atom_mask': (610, 37), 'all_atom_positions': (610, 37, 3), 'assembly_num_chains': (), 'entity_mask': (610,), 'num_templates': (), 'cluster_bias_mask': (2055,), 'bert_mask': (2055, 610), 'seq_mask': (610,), 'msa_mask': (2055, 610)}
2022-09-15 16:06:46.814915: W external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Running ptxas --version returned 32512
2022-09-15 16:06:46.931611: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:460] ptxas returned an error during compilation of ptx to sass: 'INTERNAL: ptxas exited with non-zero error code 32512, output: ' If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
Fatal Python error: Aborted
Thread 0x00002aaaaaaf1ec0 (most recent call first):
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 648 in backend_compile
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/profiler.py", line 206 in wrapper
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 703 in compile_or_get_cached
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 735 in from_xla_computation
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 640 in compile
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 198 in _xla_callable_uncached
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 116 in xla_primitive_callable
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/util.py", line 212 in cached
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/util.py", line 219 in wrapper
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 97 in apply_primitive
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/core.py", line 678 in process_primitive
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/core.py", line 328 in bind_with_trace
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/core.py", line 325 in bind
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/lax/lax.py", line 444 in shift_right_logical
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/prng.py", line 272 in threefry_seed
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/prng.py", line 232 in seed_with_impl
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/random.py", line 125 in PRNGKey
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/model/model.py", line 167 in predict
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/alphapulldown/predict_structure.py", line 58 in predict
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 249 in predict_individual_jobs
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 275 in predict_multimers
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 324 in main
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 258 in _run_main
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 312 in run
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 328 in
/var/spool/slurm/slurmd/job47738660/slurm_script: line 16: 40140 Aborted run_multimer_jobs.py --mode=pulldown --num_cycle=3 --num_predictions_per_model=1 --output_path=/data/duann2/virus_proj/alphapd/pd_parallel/pd_step2/output/models --data_dir=/data/duann2/virus_proj/alphapd/db_output/ --protein_lists=baits.txt,candidates.txt --monomer_objects_dir=/data/duann2/virus_proj/alphapd/pd_parallel/pd_output/
with something like:
try:
    run_af = load_module(PATH_TO_RUN_ALPHAFOLD, "run_alphafold")
except FileNotFoundError:
    # try to find it in the upper directory, as in the normal AlphaFold repo
    PATH_TO_RUN_ALPHAFOLD = os.path.join(
        os.path.dirname(os.path.dirname(alphafold.__file__)), "run_alphafold.py"
    )
    run_af = load_module(PATH_TO_RUN_ALPHAFOLD, "run_alphafold")
to let hard-core users like me overwrite alphafold with the original, possibly modified, repo.
AlphaPulldown/alphapulldown/objects.py
Line 168 in 4703604
Add a warning like:
"To not overload the remote server, do not submit a large number of jobs at the same time. If you want to calculate MSAs for many sequences, use Option 2 below."
AlphaPulldown/mmseqs2_manual.md
Line 3 in 4be1ecd
It is very difficult at the moment to track the logs. As things often crash (memory issues, AlphaFold errors, our errors), I find myself putting a lot of work into associating the jobs with log files. Could we have optional arguments to redirect the logging to a log file saved to the output directory, for both scripts? This is how to set logging to a file:
https://www.delftstack.com/howto/python/python-log-to-file/
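For reference, a minimal sketch of what such an option could do, assuming a hypothetical helper (add_file_handler and the job_name argument are illustrative, not existing AlphaPulldown options):

import logging
import os

def add_file_handler(output_dir, job_name):
    """Also write all log records to <output_dir>/<job_name>.log.
    absl logging propagates to the root logger, so its messages are captured too."""
    os.makedirs(output_dir, exist_ok=True)
    log_path = os.path.join(output_dir, "%s.log" % job_name)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    logging.getLogger().addHandler(handler)
    return log_path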
I was wondering how to use the high-throughput modeling of higher-order oligomers.
Is this part of the custom mode?
It reads like it would be possible to circumvent the AF-Multimer size limitation by using AlphaPulldown; I'm just not sure if I misunderstood it.
Thanks!
The part starting from "Check if you have downloaded necessary parameters and databases" is the same in all examples; move it to the "Install database" part of the main manual that I have just added.
Is the MMseqs2 mode using --max_template_date?
In principle, you could save msa_lines as a3m or sto after this line
AlphaPulldown/alphapulldown/objects.py
Line 200 in 6f5b7d0
and then run hmmer locally like in the original AlphaFold:
https://github.com/deepmind/alphafold/blob/5cb2f8c480aa8314c02a93c6fbfc3f48f0ce8af0/alphafold/data/pipeline.py#L179
Or just use ColabFold's hhsearch via mk_template?
https://github.com/sokrypton/ColabFold/blob/8771fa10ce233e02efe0191ea5fb83ce3e1ca5f8/colabfold/batch.py#L149
just using the full PDB70 database from AlphaFold?
I think 5 is too conservative by default.
Such as amber_relaxing, which I think is copied and pasted twice.
You write:
Please be aware that everything after > will be taken as the description of the protein and make sure do NOT include any special symbol, such as |, after >
But haven't you implemented a function that deals with that? Perhaps it should now say:
Please be aware that everything after > will be taken as the description of the protein, and any special symbol, such as |, will be replaced with underscores in the resulting files.
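For reference, a minimal sketch of the kind of sanitization meant here (sanitize_description is a hypothetical name; the actual AlphaPulldown function may behave differently):

import re

def sanitize_description(description):
    """Replace characters that are unsafe in file names (e.g. '|', '/', spaces)
    with underscores, as described above."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", description)

# sanitize_description("sp|P12345|MYPROT_HUMAN") -> "sp_P12345_MYPROT_HUMAN"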
Hello — when running through the first step of the example 1 md notebook inside a SLURM array, I'm running into the following issue:
Traceback (most recent call last):
File "/scratch/user/AlphaPulldown1/bin/create_individual_features.py", line 247, in <module>
app.run(main)
File "/scratch/user/AlphaPulldown1/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/scratch/user/AlphaPulldown1/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/scratch/user/AlphaPulldown1/bin/create_individual_features.py", line 240, in main
create_and_save_monomer_objects(curr_monomer, pipeline, flags_dict)
File "/scratch/user/AlphaPulldown1/bin/create_individual_features.py", line 204, in create_and_save_monomer_objects
save_msa=FLAGS.save_msa_files,
File "/scratch/user/AlphaPulldown1/lib/python3.7/site-packages/alphapulldown/objects.py", line 135, in make_features
save_msa=False,use_precomuted_msa=False)
File "/scratch/user/AlphaPulldown1/lib/python3.7/site-packages/alphapulldown/objects.py", line 116, in execute_pipeline
use_precomuted_msa=use_precomuted_msa,
TypeError: all_seq_msa_features() got an unexpected keyword argument 'msa_output_dir'
The command I'm using inside my bash script is as follows:
create_individual_features.py --fasta_paths=/scratch/user/AlphaPulldownTest/baits.fasta,\
/scratch/user/AlphaPulldownTest/sequences_shorter.fasta --data_dir=/vast/user/public/alphafold --save_msa_files=False \
--output_dir=/scratch/user/AlphaPulldownMSAOut --use_precomputed_msas=False --max_template_date=2050-01-01 --skip_existing=False \
--seq_index=$SLURM_ARRAY_TASK_ID
I've tried a couple of configurations for the settings, but without success. I'm a little unsure what is meant by "msa_output_dir" in the error message. Many thanks in advance!
Change this to:
Path(path).mkdir(parents=True, exist_ok=True)
and consider deleting pi_score_outputs beforehand (pi_score then creates new pi_score files with new dates, and I'm not sure we control what would happen if multiple files are in pi_score_outputs).
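A minimal sketch of the combined suggestion, assuming the directory path is held in a variable called path (names are illustrative):

import shutil
from pathlib import Path

def reset_pi_score_outputs(path):
    """Delete any stale pi_score_outputs and recreate the directory, so only
    freshly generated pi_score files (with current dates) are present."""
    out = Path(path)
    if out.exists():
        shutil.rmtree(out)
    out.mkdir(parents=True, exist_ok=True)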
Hi,
Is it possible to use job arrays, but assign more than one prediction per job? I have ~400 interacting pairs and I would like to be able to request an array of 10 nodes, each predicting complexes for 40 of them. If this doesn't already exist and you don't mind, I could implement the feature myself and make a pull request; a rough sketch of what I mean is below.
Best,
Sebastian
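A rough sketch of the idea, assuming a hypothetical --jobs_per_task option (not an existing AlphaPulldown flag):

import os

def jobs_for_this_task(all_jobs, jobs_per_task):
    """Return the slice of the job list this SLURM array task should run,
    e.g. 400 pairs with jobs_per_task=40 -> array indices 1..10."""
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "1"))
    start = (task_id - 1) * jobs_per_task
    return all_jobs[start:start + jobs_per_task]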
Shall we rename it and remove any Hyde mentions, as that was a library for the Hyde cluster? Cleanly remove all SLURM things? Let him look at this file.
[email protected]:/g/kosinski/kosinski/devel/AlphaPulldown$ singularity exec --no-home --bind /scratch/kosinski/testAlphaPulldown_models:/mnt \
> /g/kosinski/kosinski/devel/AlphaPulldown/alpha-analysis.sif run_get_good_pae.sh --output_dir=/mnt --cutoff=5 --create_notebook=True
I0708 21:38:20.643415 139846045984576 get_good_inter_pae.py:135] now processing O43432_and_P09132
Traceback (most recent call last):
File "/app/programme_notebook/get_good_inter_pae.py", line 171, in <module>
app.run(main)
File "/opt/conda/envs/programme_notebook/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/opt/conda/envs/programme_notebook/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/app/programme_notebook/get_good_inter_pae.py", line 141, in main
seqs = pickle.load(open(result_path,'rb'))['seqs']
ModuleNotFoundError: No module named 'jax'
Hi,
Is the model preset flag used when running alphafold multimer? I would like to be able to change the model weights, but as far as I can tell the argument is ignored. It seems like the multimer model is used whenever predicting the structure of a MultimericObject
and otherwise the monomer_ptm model is used. Would it be possible to allow the other model configurations? I'm particularly interested in using monomer_ptm weights to predict complexes.
Here:
AlphaPulldown/alphapulldown/objects.py
Line 245 in 4703604
Move lines 242-246 into the "if a3m_lines is not None:" block?
Hi,
Nice application of AF2. For me, the conda installation did not work, so I would like to try another virtual environment. Could you please provide a requirements.txt file for installation?
Thank you very much!
Just to consider in the future:
Many users might want to use ColabFold for MSAs to be quicker. Shall we provide instructions on how to do it?
Just to get the total number for the SLURM array script.
Hi,
I'm running run_multimer_jobs.py in pulldown mode with one bait protein and 100 candidate proteins. The bait is 300 amino acids and candidates are each 150 amino acids. I've been getting RuntimeError: INTERNAL: Failed to load in-memory CUBIN: CUDA_ERROR_OUT_OF_MEMORY: out of memory
every time I run the script after about 6-12 bait-candidate pairs. I can manually delete the completed runs and resume, but it's time-consuming to have to watch the script. Is there a memory leak somewhere that can be remedied?
I have an RTX 3090 with 24GB of GPU RAM.
Command:
singularity exec --no-home --bind /scratch/kosinski/Giardia/interactome/Group1/homooligomers:/mnt
/g/kosinski/kosinski/devel/AlphaPulldown/alpha-analysis.sif run_get_good_pae.sh --output_dir=/mnt --cutoff=5
Error:
Traceback (most recent call last):
File "/app/programme_notebook/get_good_inter_pae.py", line 136, in <module>
app.run(main)
File "/opt/conda/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/opt/conda/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/app/programme_notebook/get_good_inter_pae.py", line 110, in main
iptm_ptm_score = json.load(open(os.path.join(result_subdir,"ranking_debug.json"),'rb'))['iptm+ptm'][best_model]
KeyError: 'iptm+ptm'
Nothing happens after this line?
AlphaPulldown/alphapulldown/objects.py
Line 189 in 6f5b7d0
In:
AlphaPulldown/mmseqs2_manual.md
Line 78 in 360a391
So what is the logic with --mmseqs2 now? Is this correct?
Regardless of the --use_precomputed_msas option, it will check whether an a3m exists; if it doesn't, it will run remote MMseqs2; if it does, it will take the local a3m.
So --use_precomputed_msas has no effect in this mode; should it be somehow blocked then? Or, when --use_precomputed_msas=False, always run remote (requiring users to set --use_precomputed_msas=True if local a3m files are to be used)? The latter seems most logical to me.
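A minimal sketch of the latter behaviour, for discussion (function and argument names are illustrative, not the current implementation):

import os

def get_a3m(seq_name, output_dir, use_precomputed_msas, run_remote_mmseqs2):
    """Use a local a3m only when --use_precomputed_msas=True and the file exists;
    otherwise always query the remote MMseqs2 server."""
    a3m_path = os.path.join(output_dir, seq_name, "%s.a3m" % seq_name)
    if use_precomputed_msas and os.path.exists(a3m_path):
        with open(a3m_path) as f:
            return f.read()
    return run_remote_mmseqs2(seq_name)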
Hi,
I'm currently running the example and everything seems to be going smoothly. I'm interested in using this in pulldown mode, where I have a natural protein and a list of peptides which are designed or otherwise synthetic. As the peptides have no homologs, there is little reason to spend time generating MSAs for each. Would it be possible to control for which proteins the HMM search is performed during feature generation? I'm not sure whether it would be easier to create dummy (empty) MSA files or to just modify run_multimer_jobs.py to handle the case where a peptide has no MSA, but either would work fine for me.
Respectfully,
Sebastian
Hi,
I'd like to test out some changes and additional arguments to the scripts in this repo. I made a fork of the repo, made some changes to the code, and then tried to install the repo as a package so I could verify that the changes worked. I tried making a very simple setup.py file and then installing it as a package into my conda environment with pip install -e . I then tested the package by running the unaltered code, but I got an odd error (attached for reference at the bottom of this message), which I suspect is due to an AlphaFold version mismatch.
What is the easiest way for me to install my forked repo in the same way as the official AlphaPulldown package? Is there a setup.py or other file that I could use to ensure that it's installed in the same way?
In case it's informative, here's the error:
File "/home/gridsan/sswanson/miniconda3/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 7, in <module>
exec(compile(f.read(), __file__, 'exec'))
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/run_multimer_jobs.py", line 333, in <module>
app.run(main)
File "/home/gridsan/sswanson/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/home/gridsan/sswanson/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/run_multimer_jobs.py", line 329, in main
predict_multimers(multimers)
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/run_multimer_jobs.py", line 279, in predict_multimers
random_seed=random_seed,
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/run_multimer_jobs.py", line 252, in predict_individual_jobs
seqs=multimer_object.input_seqs,
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/predict_structure.py", line 62, in predict
prediction_result.update({"seqs": seqs})
AttributeError: 'tuple' object has no attribute 'update'
When I use the notebook and execute cells, the output (PAE plots, 3D renderings) is saved in the notebook automatically. If the notebook has many entries, it becomes unusable at some point. Is it possible in Jupyter, perhaps by adding some directive at the top of the notebook, to prevent auto-saving of output?
Otherwise it is difficult to trace crashed jobs from the log files...
These two modules are not compatible in my tests:
module load HMMER/3.1b2-foss-2016b
module load HH-suite/3.3.0-gompic-2020b
Have you tested that this works for you? I replaced gompic with foss and it runs, but with some warnings.
Have you tested that data_dir is not needed? Isn't data_dir used to locate the hhsearch database for the template search?
AlphaPulldown/mmseqs2_manual.md
Line 24 in 4be1ecd
Can you create this folder:
https://github.com/henrywotton/AlphaPulldown/blob/37c2b1c2b25ded268e37f6ab11e418fd7ecbb7cf/alphapulldown/objects.py#L134
if it does not exist, after this line? Otherwise the pipeline does not work properly if I set use_existing_msas=True but the folder doesn't exist yet (e.g. when re-running an array where some jobs partially crashed). With use_existing_msas but the MSAs absent, AlphaFold would still run and generate the missing MSAs, but here it cannot because the parent folder does not exist.
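A minimal sketch of the requested change (assuming the variable is called msa_output_dir, as in objects.py):

from pathlib import Path

# Create the per-protein MSA folder before reading/writing MSAs, so re-runs with
# use_existing_msas=True do not fail when the folder is missing.
Path(msa_output_dir).mkdir(parents=True, exist_ok=True)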
Missing "s".
In the pip version of alphapulldown Path(msa_output_dir).mkdir(parents=True, exist_ok=True) is not present in make_features ... (ba913a1).
Hi,
Thanks for your help! I successfully ran the pipeline in pulldown mode, following example 1, with one bait sequence against 30,000 candidate sequences. There are multiple PDB files as output. I am wondering: are all these PDBs structures of the interaction between the bait and a candidate? Is no PDB only for the bait or a candidate alone? Do I need to run AlphaFold again to get the structure of each candidate?
Thanks,
Ning
Add a command-line option to zip the output MSAs in create_individual_features.py if --save_msa_files is True
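A minimal sketch of what such an option could do (the flag name --compress_msa_files and the helper are hypothetical):

import shutil

def zip_msa_output(msa_output_dir):
    """Archive the saved MSAs as <msa_output_dir>.zip and remove the original
    directory to save space."""
    archive = shutil.make_archive(msa_output_dir, "zip", root_dir=msa_output_dir)
    shutil.rmtree(msa_output_dir)
    return archive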
It worked for a couple of jobs but for most it crashes with:
Traceback (most recent call last):
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/bin/run_multimer_jobs.py", line 329, in <module>
app.run(main)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/bin/run_multimer_jobs.py", line 325, in main
predict_multimers(multimers)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/bin/run_multimer_jobs.py", line 289, in predict_multimers
random_seed=random_seed,
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/bin/run_multimer_jobs.py", line 250, in predict_individual_jobs
create_and_save_pae_plots(multimer_object, output_path)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/alphapulldown/utils.py", line 182, in create_and_save_pae_plots
multimer_object.input_seqs, order, output_dir, multimer_object.description
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/alphapulldown/plot_pae.py", line 35, in plot_pae
fig, ax1 = plt.subplots(1, 1)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/cbook/deprecation.py", line 451, in wrapper
return func(*args, **kwargs)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/pyplot.py", line 1287, in subplots
fig = figure(**fig_kw)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/pyplot.py", line 693, in figure
**kwargs)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/pyplot.py", line 315, in new_figure_manager
return _backend_mod.new_figure_manager(*args, **kwargs)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/backend_bases.py", line 3494, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/backends/_backend_tk.py", line 885, in new_figure_manager_given_figure
window = tk.Tk(className="matplotlib")
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/tkinter/__init__.py", line 2020, in __init__
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: couldn't connect to display ":0"
Maybe you need to set the matplotlib backend explicitly?
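For example, forcing a non-interactive backend before pyplot is imported should avoid the Tk "couldn't connect to display" error above (a sketch; where exactly to put it in plot_pae.py is up to you):

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, works on nodes without a display
import matplotlib.pyplot as plt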
I think the translation example should be smaller, to let people test more easily. What about taking the top 10 based on iPTM, the bottom 5 from the Google Sheet list, and 5 of those without any good PAE?
Hi, I'm trying to run pulldown for the first time and I am unable to get the example to work. It seems like an issue with alphafold's jackhmmer script but I'm not sure how to debug. Thanks!
(AlphaPulldown) kduong@glycine:~/AlphaPulldown$ python3 alphapulldown/create_individual_features.py --fasta_paths=example_data/example_1_sequences_shorter.fasta --data_dir=/home/kduong/af2_databases/ --save_msa_files=False --output_dir=/home/kduong/pulldown_input_output/output/ --use_precomputed_msas=False --max_template_date=2050-01-01
2022-08-24 15:53:43.371134: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
I0824 15:53:45.712832 140448664971072 templates.py:857] Using precomputed obsolete pdbs /home/kduong/af2_databases/pdb_mmcif/obsolete.dat.
I0824 15:53:45.718152 140448664971072 objects.py:112] You have chosen not to save msa output files
Traceback (most recent call last):
File "alphapulldown/create_individual_features.py", line 247, in <module>
app.run(main)
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "alphapulldown/create_individual_features.py", line 240, in main
create_and_save_monomer_objects(curr_monomer, pipeline, flags_dict)
File "alphapulldown/create_individual_features.py", line 204, in create_and_save_monomer_objects
save_msa=FLAGS.save_msa_files,
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphapulldown/objects.py", line 118, in make_features
input_fasta_path=fasta_file, msa_output_dir=tmpdirname
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/data/pipeline.py", line 169, in process
max_sto_sequences=self.uniref_max_hits)
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/data/pipeline.py", line 94, in run_msa_tool
result = msa_runner.query(input_fasta_path, max_sto_sequences)[0] # pytype: disable=wrong-arg-count
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/data/tools/jackhmmer.py", line 172, in query
input_fasta_path, self.database_path, max_sequences)
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/data/tools/jackhmmer.py", line 133, in _query_chunk
logging.info('Launching subprocess "%s"', ' '.join(cmd))
TypeError: sequence item 0: expected str instance, NoneType found
Hi,
Can I allocate more CPUs to the first step of the prediction? I can't find the argument to assign more CPUs. I have over 30,000 proteins that need to be processed; it will still take days even though I have split it up into 200 sub-jobs. Could I have some suggestions?
Thanks,
Ning
I think it does not correlate with anything, and is confusing and useless.