kosinskilab / alphapulldown
Home Page: https://doi.org/10.1093/bioinformatics/btac749
License: GNU General Public License v3.0
Which data dir on our cluster (your copy, without symlinks) should be used for the singularity option?
This is really for later and to discuss first:
Sometimes we do want to re-calculate MSA and templates for a chopped fragment, so create_features should have an option to take prot_A,start-end as input and create features that preserve numbering.
Hi,
I was trying to run one protein against 200 protein features in the second step and got this error when running it on one GPU. Could I have some suggestions about this error?
Thanks!
Ning
I0915 16:06:40.291960 46912496410304 run_multimer_jobs.py:158] done creating multimer AcrIF1_and_CAADJO010000003.1_2775
I0915 16:06:40.625929 46912496410304 run_multimer_jobs.py:158] done creating multimer AcrIF1_and_CAADJO010000003.1_278
I0915 16:06:40.790300 46912496410304 run_multimer_jobs.py:158] done creating multimer AcrIF1_and_CAADJO010000003.1_2851
I0915 16:06:40.859936 46912496410304 run_multimer_jobs.py:158] done creating multimer AcrIF1_and_CAADJO010000003.1_2994
I0915 16:06:41.130373 46912496410304 xla_bridge.py:264] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I0915 16:06:41.459139 46912496410304 xla_bridge.py:264] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
I0915 16:06:46.201552 46912496410304 run_multimer_jobs.py:236] now running prediction on AcrIF1_and_AAQW01000001.1_2315
I0915 16:06:46.201746 46912496410304 predict_structure.py:40] Checking for /data/duann2/virus_proj/alphapd/pd_parallel/pd_step2/output/models/AcrIF1_and_AAQW01000001.1_2315/ranking_debug.json
I0915 16:06:46.202050 46912496410304 predict_structure.py:48] Running model model_1_multimer_v2_pred_0 on AcrIF1_and_AAQW01000001.1_2315
I0915 16:06:46.202407 46912496410304 model.py:166] Running predict with shape(feat) = {'aatype': (610,), 'residue_index': (610,), 'seq_length': (), 'msa': (2055, 610), 'num_alignments': (), 'template_aatype': (4, 610), 'template_all_atom_mask': (4, 610, 37), 'template_all_atom_positions': (4, 610, 37, 3), 'asym_id': (610,), 'sym_id': (610,), 'entity_id': (610,), 'deletion_matrix': (2055, 610), 'deletion_mean': (610,), 'all_atom_mask': (610, 37), 'all_atom_positions': (610, 37, 3), 'assembly_num_chains': (), 'entity_mask': (610,), 'num_templates': (), 'cluster_bias_mask': (2055,), 'bert_mask': (2055, 610), 'seq_mask': (610,), 'msa_mask': (2055, 610)}
2022-09-15 16:06:46.814915: W external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Running ptxas --version returned 32512
2022-09-15 16:06:46.931611: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:460] ptxas returned an error during compilation of ptx to sass: 'INTERNAL: ptxas exited with non-zero error code 32512, output: ' If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
Fatal Python error: Aborted
Thread 0x00002aaaaaaf1ec0 (most recent call first):
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 648 in backend_compile
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/profiler.py", line 206 in wrapper
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 703 in compile_or_get_cached
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 735 in from_xla_computation
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 640 in compile
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 198 in _xla_callable_uncached
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 116 in xla_primitive_callable
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/util.py", line 212 in cached
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/util.py", line 219 in wrapper
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/dispatch.py", line 97 in apply_primitive
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/core.py", line 678 in process_primitive
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/core.py", line 328 in bind_with_trace
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/core.py", line 325 in bind
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/lax/lax.py", line 444 in shift_right_logical
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/prng.py", line 272 in threefry_seed
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/prng.py", line 232 in seed_with_impl
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/jax/_src/random.py", line 125 in PRNGKey
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/model/model.py", line 167 in predict
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/alphapulldown/predict_structure.py", line 58 in predict
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 249 in predict_individual_jobs
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 275 in predict_multimers
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 324 in main
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 258 in _run_main
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 312 in run
File "/data/duann2/deeplearning/conda/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 328 in
/var/spool/slurm/slurmd/job47738660/slurm_script: line 16: 40140 Aborted run_multimer_jobs.py --mode=pulldown --num_cycle=3 --num_predictions_per_model=1 --output_path=/data/duann2/virus_proj/alphapd/pd_parallel/pd_step2/output/models --data_dir=/data/duann2/virus_proj/alphapd/db_output/ --protein_lists=baits.txt,candidates.txt --monomer_objects_dir=/data/duann2/virus_proj/alphapd/pd_parallel/pd_output/
with something like:
try:
    run_af = load_module(PATH_TO_RUN_ALPHAFOLD, "run_alphafold")
except FileNotFoundError:
    # try to find it in the upper directory, as in the normal AlphaFold repo
    PATH_TO_RUN_ALPHAFOLD = os.path.join(
        os.path.dirname(os.path.dirname(alphafold.__file__)), "run_alphafold.py"
    )
    run_af = load_module(PATH_TO_RUN_ALPHAFOLD, "run_alphafold")
to let hard-core users like me overwrite alphafold with the original, possibly modified, repo.
AlphaPulldown/alphapulldown/objects.py
Line 168 in 4703604
Add a warning like:
"To not overload the remote server, do not submit a large number of jobs at the same time. If you want to calculate MSAs for many sequences, use Option 2 below."
AlphaPulldown/mmseqs2_manual.md
Line 3 in 4be1ecd
It is very difficult at the moment to track the logs. As things often crash (memory issues, AlphaFold errors, our errors), I find myself putting a lot of work into associating the jobs with log files. Could we have optional arguments to redirect the logging to a log file saved to the output directory, for both scripts? This is how to set logging to a file:
https://www.delftstack.com/howto/python/python-log-to-file/
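For reference, a minimal sketch of what such an option could do, assuming a hypothetical helper (add_file_handler and the job_name argument are illustrative, not existing AlphaPulldown options):

import logging
import os

def add_file_handler(output_dir, job_name):
    """Also write all log records to <output_dir>/<job_name>.log.
    absl logging propagates to the root logger, so its messages are captured too."""
    os.makedirs(output_dir, exist_ok=True)
    log_path = os.path.join(output_dir, "%s.log" % job_name)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    logging.getLogger().addHandler(handler)
    return log_path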
I was wondering how to use the high-throughput modeling of higher-order oligomers.
Is this part of the custom mode?
It reads like it would be possible to circumvent the AF-Multimer size limitation by using AlphaPulldown; I'm just not sure if I misunderstood it.
Thanks!
The part starting from "Check if you have downloaded necessary parameters and databases" is the same in all examples; move it to the "Install database" part of the main manual that I have just added.
Is the MMseqs2 mode using --max_template_date?
In principle, you could save msa_lines as a3m or sto after this line
AlphaPulldown/alphapulldown/objects.py
Line 200 in 6f5b7d0
and then run hmmer locally like in the original AlphaFold:
https://github.com/deepmind/alphafold/blob/5cb2f8c480aa8314c02a93c6fbfc3f48f0ce8af0/alphafold/data/pipeline.py#L179
Or just use ColabFold's hhsearch via mk_template?
https://github.com/sokrypton/ColabFold/blob/8771fa10ce233e02efe0191ea5fb83ce3e1ca5f8/colabfold/batch.py#L149
just using the full PDB70 database from AlphaFold?
I think 5 is too conservative by default.
Such as amber_relaxing, which I think is copied and pasted twice.
You write:
Please be aware that everything after > will be taken as the description of the protein and make sure do NOT include any special symbol, such as |, after >
But haven't you implemented a function that deals with that? Perhaps it should now say:
Please be aware that everything after > will be taken as the description of the protein, and any special symbol, such as |, will be replaced with underscores in the resulting files.
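For reference, a minimal sketch of the kind of sanitization meant here (sanitize_description is a hypothetical name; the actual AlphaPulldown function may behave differently):

import re

def sanitize_description(description):
    """Replace characters that are unsafe in file names (e.g. '|', '/', spaces)
    with underscores, as described above."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", description)

# sanitize_description("sp|P12345|MYPROT_HUMAN") -> "sp_P12345_MYPROT_HUMAN"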
Hello — when running through the first step of the example 1 md notebook inside a SLURM array, I'm running into the following issue:
Traceback (most recent call last):
File "/scratch/user/AlphaPulldown1/bin/create_individual_features.py", line 247, in <module>
app.run(main)
File "/scratch/user/AlphaPulldown1/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/scratch/user/AlphaPulldown1/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/scratch/user/AlphaPulldown1/bin/create_individual_features.py", line 240, in main
create_and_save_monomer_objects(curr_monomer, pipeline, flags_dict)
File "/scratch/user/AlphaPulldown1/bin/create_individual_features.py", line 204, in create_and_save_monomer_objects
save_msa=FLAGS.save_msa_files,
File "/scratch/user/AlphaPulldown1/lib/python3.7/site-packages/alphapulldown/objects.py", line 135, in make_features
save_msa=False,use_precomuted_msa=False)
File "/scratch/user/AlphaPulldown1/lib/python3.7/site-packages/alphapulldown/objects.py", line 116, in execute_pipeline
use_precomuted_msa=use_precomuted_msa,
TypeError: all_seq_msa_features() got an unexpected keyword argument 'msa_output_dir'
The command I'm using inside my bash script is as follows:
create_individual_features.py --fasta_paths=/scratch/user/AlphaPulldownTest/baits.fasta,\
/scratch/user/AlphaPulldownTest/sequences_shorter.fasta --data_dir=/vast/user/public/alphafold --save_msa_files=False \
--output_dir=/scratch/user/AlphaPulldownMSAOut --use_precomputed_msas=False --max_template_date=2050-01-01 --skip_existing=False \
--seq_index=$SLURM_ARRAY_TASK_ID
I've tried a couple of configurations for the settings, but without success. I'm a little unsure what is meant by "msa_output_dir" in the error message. Many thanks in advance!
Change this to:
Path(path).mkdir(parents=True, exist_ok=True)
and consider deleting pi_score_outputs beforehand (pi_score then creates new pi_score files with new dates, and I'm not sure we control what would happen if multiple files are in pi_score_outputs).
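A minimal sketch of the combined suggestion, assuming the directory path is held in a variable called path (names are illustrative):

import shutil
from pathlib import Path

def reset_pi_score_outputs(path):
    """Delete any stale pi_score_outputs and recreate the directory, so only
    freshly generated pi_score files (with current dates) are present."""
    out = Path(path)
    if out.exists():
        shutil.rmtree(out)
    out.mkdir(parents=True, exist_ok=True)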
Hi,
Is it possible to use job arrays, but assign more than one prediction per job? I have ~400 interacting pairs and I would like to be able to request an array of 10 nodes, each predicting complexes for 40 of them. If this doesn't already exist and you don't mind, I could implement the feature myself and make a pull request; a rough sketch of what I mean is below.
Best,
Sebastian
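A rough sketch of the idea, assuming a hypothetical --jobs_per_task option (not an existing AlphaPulldown flag):

import os

def jobs_for_this_task(all_jobs, jobs_per_task):
    """Return the slice of the job list this SLURM array task should run,
    e.g. 400 pairs with jobs_per_task=40 -> array indices 1..10."""
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "1"))
    start = (task_id - 1) * jobs_per_task
    return all_jobs[start:start + jobs_per_task]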
Shall we rename it and remove any Hyde mentions, as that was a library for the Hyde cluster? Cleanly remove all SLURM things? Let him look at this file.
[email protected]:/g/kosinski/kosinski/devel/AlphaPulldown$ singularity exec --no-home --bind /scratch/kosinski/testAlphaPulldown_models:/mnt \
> /g/kosinski/kosinski/devel/AlphaPulldown/alpha-analysis.sif run_get_good_pae.sh --output_dir=/mnt --cutoff=5 --create_notebook=True
I0708 21:38:20.643415 139846045984576 get_good_inter_pae.py:135] now processing O43432_and_P09132
Traceback (most recent call last):
File "/app/programme_notebook/get_good_inter_pae.py", line 171, in <module>
app.run(main)
File "/opt/conda/envs/programme_notebook/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/opt/conda/envs/programme_notebook/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/app/programme_notebook/get_good_inter_pae.py", line 141, in main
seqs = pickle.load(open(result_path,'rb'))['seqs']
ModuleNotFoundError: No module named 'jax'
Hi,
Is the model preset flag used when running alphafold multimer? I would like to be able to change the model weights, but as far as I can tell the argument is ignored. It seems like the multimer model is used whenever predicting the structure of a MultimericObject
and otherwise the monomer_ptm model is used. Would it be possible to allow the other model configurations? I'm particularly interested in using monomer_ptm weights to predict complexes.
Here:
AlphaPulldown/alphapulldown/objects.py
Line 245 in 4703604
Move lines 242-246 into the "if a3m_lines is not None:" block?
Hi,
Nice application of AF2. For me, the conda installation did not work, so I would like to try another virtual environment. Could you please provide a requirements.txt file for installation?
Thank you very much!
Just to consider in the future:
Many users might want to use ColabFold for MSAs to be quicker. Shall we provide instructions on how to do it?
Just to get the total number for the SLURM array script.
Hi,
I'm running run_multimer_jobs.py in pulldown mode with one bait protein and 100 candidate proteins. The bait is 300 amino acids and candidates are each 150 amino acids. I've been getting RuntimeError: INTERNAL: Failed to load in-memory CUBIN: CUDA_ERROR_OUT_OF_MEMORY: out of memory
every time I run the script after about 6-12 bait-candidate pairs. I can manually delete the completed runs and resume, but it's time-consuming to have to watch the script. Is there a memory leak somewhere that can be remedied?
I have an RTX 3090 with 24GB of GPU RAM.
Command:
singularity exec --no-home --bind /scratch/kosinski/Giardia/interactome/Group1/homooligomers:/mnt
/g/kosinski/kosinski/devel/AlphaPulldown/alpha-analysis.sif run_get_good_pae.sh --output_dir=/mnt --cutoff=5
Error:
Traceback (most recent call last):
File "/app/programme_notebook/get_good_inter_pae.py", line 136, in <module>
app.run(main)
File "/opt/conda/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/opt/conda/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/app/programme_notebook/get_good_inter_pae.py", line 110, in main
iptm_ptm_score = json.load(open(os.path.join(result_subdir,"ranking_debug.json"),'rb'))['iptm+ptm'][best_model]
KeyError: 'iptm+ptm'
Nothing happens after this line?
AlphaPulldown/alphapulldown/objects.py
Line 189 in 6f5b7d0
In:
AlphaPulldown/mmseqs2_manual.md
Line 78 in 360a391
So what is the logic with --mmseqs2 now? Is this correct?
Regardless of the --use_precomputed_msas option, it will check whether an a3m exists; if it doesn't, it will run remote MMseqs2; if it does, it will take the local a3m.
So --use_precomputed_msas has no effect in this mode; should it be somehow blocked then? Or, when --use_precomputed_msas=False, always run remote (requiring users to set --use_precomputed_msas=True if local a3m files are to be used)? The latter seems most logical to me.
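A minimal sketch of the latter behaviour, for discussion (function and argument names are illustrative, not the current implementation):

import os

def get_a3m(seq_name, output_dir, use_precomputed_msas, run_remote_mmseqs2):
    """Use a local a3m only when --use_precomputed_msas=True and the file exists;
    otherwise always query the remote MMseqs2 server."""
    a3m_path = os.path.join(output_dir, seq_name, "%s.a3m" % seq_name)
    if use_precomputed_msas and os.path.exists(a3m_path):
        with open(a3m_path) as f:
            return f.read()
    return run_remote_mmseqs2(seq_name)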
Hi,
I'm currently running the example and everything seems to be going smoothly. I'm interested in using this in pulldown mode, where I have a natural protein and a list of peptides which are designed or otherwise synthetic. As the peptides have no homologs, there is little reason to spend time generating MSAs for each. Would it be possible to control for which proteins the HMM search is performed during feature generation? I'm not sure whether it would be easier to create dummy (empty) MSA files or to just modify run_multimer_jobs.py to handle the case where a peptide has no MSA, but either would work fine for me.
Respectfully,
Sebastian
Hi,
I'd like to test out some changes and additional arguments to the scripts in this repo. I made a fork of the repo, made some changes to the code, and then tried to install the repo as a package so I could verify that the changes worked. I tried making a very simple setup.py file and then installing it as a package into my conda environment with pip install -e . I then tested the package by running the unaltered code, but I got an odd error (attached for reference at the bottom of this message), which I suspect is due to an AlphaFold version mismatch.
What is the easiest way for me to install my forked repo in the same way as the official AlphaPulldown package? Is there a setup.py or other file that I could use to ensure that it's installed in the same way?
In case it's informative, here's the error:
File "/home/gridsan/sswanson/miniconda3/envs/AlphaPulldown/bin/run_multimer_jobs.py", line 7, in <module>
exec(compile(f.read(), __file__, 'exec'))
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/run_multimer_jobs.py", line 333, in <module>
app.run(main)
File "/home/gridsan/sswanson/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/home/gridsan/sswanson/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/run_multimer_jobs.py", line 329, in main
predict_multimers(multimers)
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/run_multimer_jobs.py", line 279, in predict_multimers
random_seed=random_seed,
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/run_multimer_jobs.py", line 252, in predict_individual_jobs
seqs=multimer_object.input_seqs,
File "/home/gridsan/sswanson/local_code_mirror/AlphaPulldown/alphapulldown/predict_structure.py", line 62, in predict
prediction_result.update({"seqs": seqs})
AttributeError: 'tuple' object has no attribute 'update'
When I use the notebook and execute cells, the output (PAE plots, 3D renderings) is saved in the notebook automatically. If the notebook has many entries, it becomes unusable at some point. Is it possible in Jupyter, perhaps by adding some directive at the top of the notebook, to prevent auto-saving of output?
Otherwise it is difficult to trace crashed jobs from the log files...
These two modules are not compatible in my tests:
module load HMMER/3.1b2-foss-2016b
module load HH-suite/3.3.0-gompic-2020b
Have you tested that this works for you? I replaced gompic with foss and it runs, but with some warnings.
Have you tested that data_dir is not needed? Isn't data_dir used to locate the hhsearch database for the template search?
AlphaPulldown/mmseqs2_manual.md
Line 24 in 4be1ecd
Can you create this folder:
https://github.com/henrywotton/AlphaPulldown/blob/37c2b1c2b25ded268e37f6ab11e418fd7ecbb7cf/alphapulldown/objects.py#L134
if it does not exist, after this line? Otherwise the pipeline does not work properly if I set use_existing_msas=True but the folder doesn't exist yet (e.g. when re-running an array where some jobs partially crashed). With use_existing_msas but the MSAs absent, AlphaFold would still run and generate the missing MSAs, but here it cannot because the parent folder does not exist.
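A minimal sketch of the requested change (assuming the variable is called msa_output_dir, as in objects.py):

from pathlib import Path

# Create the per-protein MSA folder before reading/writing MSAs, so re-runs with
# use_existing_msas=True do not fail when the folder is missing.
Path(msa_output_dir).mkdir(parents=True, exist_ok=True)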
Missing "s".
In the pip version of alphapulldown Path(msa_output_dir).mkdir(parents=True, exist_ok=True) is not present in make_features ... (ba913a1).
Hi,
Thanks for your help! I successfully ran the pipeline in pulldown mode, following example 1, with one bait sequence against 30,000 candidate sequences. There are multiple PDB files as output. I am wondering: are all these PDBs structures of the interaction between the bait and a candidate? Is no PDB only for the bait or a candidate alone? Do I need to run AlphaFold again to get the structure of each candidate?
Thanks,
Ning
Add a command-line option to zip the output MSAs in create_individual_features.py if --save_msa_files is True
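A minimal sketch of what such an option could do (the flag name --compress_msa_files and the helper are hypothetical):

import shutil

def zip_msa_output(msa_output_dir):
    """Archive the saved MSAs as <msa_output_dir>.zip and remove the original
    directory to save space."""
    archive = shutil.make_archive(msa_output_dir, "zip", root_dir=msa_output_dir)
    shutil.rmtree(msa_output_dir)
    return archive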
It worked for a couple of jobs but for most it crashes with:
Traceback (most recent call last):
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/bin/run_multimer_jobs.py", line 329, in <module>
app.run(main)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/bin/run_multimer_jobs.py", line 325, in main
predict_multimers(multimers)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/bin/run_multimer_jobs.py", line 289, in predict_multimers
random_seed=random_seed,
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/bin/run_multimer_jobs.py", line 250, in predict_individual_jobs
create_and_save_pae_plots(multimer_object, output_path)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/alphapulldown/utils.py", line 182, in create_and_save_pae_plots
multimer_object.input_seqs, order, output_dir, multimer_object.description
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/alphapulldown/plot_pae.py", line 35, in plot_pae
fig, ax1 = plt.subplots(1, 1)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/cbook/deprecation.py", line 451, in wrapper
return func(*args, **kwargs)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/pyplot.py", line 1287, in subplots
fig = figure(**fig_kw)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/pyplot.py", line 693, in figure
**kwargs)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/pyplot.py", line 315, in new_figure_manager
return _backend_mod.new_figure_manager(*args, **kwargs)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/backend_bases.py", line 3494, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/site-packages/matplotlib/backends/_backend_tk.py", line 885, in new_figure_manager_given_figure
window = tk.Tk(className="matplotlib")
File "/g/kosinski/kosinski/software/envs/TestAlphaPulldown/lib/python3.7/tkinter/__init__.py", line 2020, in __init__
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: couldn't connect to display ":0"
Maybe you need to set the matplotlib backend explicitly?
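For example, forcing a non-interactive backend before pyplot is imported should avoid the Tk "couldn't connect to display" error above (a sketch; where exactly to put it in plot_pae.py is up to you):

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, works on nodes without a display
import matplotlib.pyplot as plt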
I think the translation example should be smaller, to let people test more easily. What about taking the top 10 based on iPTM, the bottom 5 from the Google Sheet list, and 5 of those without any good PAE?
Hi, I'm trying to run pulldown for the first time and I am unable to get the example to work. It seems like an issue with alphafold's jackhmmer script but I'm not sure how to debug. Thanks!
(AlphaPulldown) kduong@glycine:~/AlphaPulldown$ python3 alphapulldown/create_individual_features.py --fasta_paths=example_data/example_1_sequences_shorter.fasta --data_dir=/home/kduong/af2_databases/ --save_msa_files=False --output_dir=/home/kduong/pulldown_input_output/output/ --use_precomputed_msas=False --max_template_date=2050-01-01
2022-08-24 15:53:43.371134: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
I0824 15:53:45.712832 140448664971072 templates.py:857] Using precomputed obsolete pdbs /home/kduong/af2_databases/pdb_mmcif/obsolete.dat.
I0824 15:53:45.718152 140448664971072 objects.py:112] You have chosen not to save msa output files
Traceback (most recent call last):
File "alphapulldown/create_individual_features.py", line 247, in <module>
app.run(main)
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "alphapulldown/create_individual_features.py", line 240, in main
create_and_save_monomer_objects(curr_monomer, pipeline, flags_dict)
File "alphapulldown/create_individual_features.py", line 204, in create_and_save_monomer_objects
save_msa=FLAGS.save_msa_files,
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphapulldown/objects.py", line 118, in make_features
input_fasta_path=fasta_file, msa_output_dir=tmpdirname
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/data/pipeline.py", line 169, in process
max_sto_sequences=self.uniref_max_hits)
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/data/pipeline.py", line 94, in run_msa_tool
result = msa_runner.query(input_fasta_path, max_sto_sequences)[0] # pytype: disable=wrong-arg-count
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/data/tools/jackhmmer.py", line 172, in query
input_fasta_path, self.database_path, max_sequences)
File "/home/kduong/miniconda3/envs/AlphaPulldown/lib/python3.7/site-packages/alphafold/data/tools/jackhmmer.py", line 133, in _query_chunk
logging.info('Launching subprocess "%s"', ' '.join(cmd))
TypeError: sequence item 0: expected str instance, NoneType found
Hi,
Can I allocate more CPUs to the first step of the prediction? I can't find the argument to assign more CPUs. I have over 30,000 proteins that need to be processed; it will still take days even though I have split it up into 200 sub-jobs. Could I have some suggestions?
Thanks,
Ning
I think it does not correlate with anything, and is confusing and useless.