
metameta's Issues

Can't download the pre-config database

I tried to reproduce the sample-data results after installing metameta with the command you provided, but I get an error when the program downloads the pre-configured database:

RuleException:
CalledProcessError in line 15 of /anaconda3/envs/metameta/opt/metameta/scripts/preconfigdb.sm:
Command '  curl -L -o /some_tests/metameta/databasearchaea_bacteria_201503/kaiju.tar.gz https://zenodo.org/record/819425/files/kaiju_bac_arc_v1.tar.gz > /some_tests/metameta/databasearchaea_bacteria_201503/log/kaiju_db_archaea_bacteria_201503_download.log 2>&1 ' returned non-zero exit status 18.
  File "/anaconda3/envs/metameta/opt/metameta/scripts/preconfigdb.sm", line 15, in __rule_preconfigdb_download
  File "/anaconda3/envs/metameta/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Exiting because a job execution failed. Look above for error message

So I tried to download the database directly with the command
wget -c https://zenodo.org/record/819425/files/kaiju_bac_arc_v1.tar.gz, but it returns "Unable to establish SSL connection." The same happens with clark_bac_arc_v1.tar.gz...

Is there an alternative way to download the pre-configured databases?

Thanks
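
A note on the error above: curl exit status 18 means the transfer ended before the whole file was received, so resuming the partial download with retries may get around a flaky connection to Zenodo. A minimal sketch, using the URL from the log and an example output filename:

# resume (-C -) and retry the download of the pre-configured kaiju database
curl -L -C - --retry 5 --retry-delay 10 \
  -o kaiju.tar.gz \
  https://zenodo.org/record/819425/files/kaiju_bac_arc_v1.tar.gz

The same approach applies to the clark tarball; if wget keeps failing on the SSL handshake, using curl as above is usually the simpler path.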

Unable to set utime on symlink K1T0-B/reads/pre1.1.fq.gz. Your Python build does not support it.

Hi Vitor,

I ran into a few issues with metameta when trying it out on my own data. During the run, I repeatedly get the message:
Unable to set utime on symlink K1T0-B/reads/pre1.1.fq.gz. Your Python build does not support it.

It seems this happens with all the files in the reads folder.
Should I ignore this or not? I thought Miniconda came with its own Python.
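
For reference, this warning is emitted by Snakemake when the active Python interpreter cannot set timestamps directly on a symlink (rather than on its target). A one-line check of whether the interpreter supports it, standard library only:

# prints True if os.utime can act on the symlink itself instead of its target
python -c 'import os; print(os.utime in os.supports_follow_symlinks)'

Judging by the log below, the jobs still complete, so the warning appears to be harmless for the results themselves.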

**The background.**
I have installed metameta on my area of the Abel computing cluster, which uses SLURM.

I installed Miniconda this way:

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
 chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh

I then made sure that no Python 2 / Perl modules were in my PATH.

Next, I added Miniconda to the PATH:

export PATH=/usit/abel/u1/thhaverk/miniconda3/bin:$PATH

And then I installed metameta with the command:

conda create -c bioconda -n metameta metameta=1.1 bowtie2=2.3.0 clark=1.2.3 dudes=0.07 gottcha=1.0 jellyfish=1.1.11 kaiju=1.0 kraken=0.10.5beta krona=2.7 metametamerge=1.1 motus=1.0 spades=3.9.0 trimmomatic=0.36

After that step, I could activate the metameta environment and run it.

I created a SLURM script to start the metameta analysis, in which I call metameta with the command:

metameta --configfile MM_config_3.yaml --keep-going -j 999 --cluster-config MM_cluster.json --cluster "sbatch --account=nn9244k --mem-per-cpu={cluster.mem} --partition={cluster.partition} --cpus-per-task={cluster.cpus-per-task} --time={cluster.time} --job-name={cluster.job-name} --output={cluster.output}"

The first bit of output from metameta:

Starting job 18309009 ("MetaMeta") on c16-12 at Mon Sep 11 10:37:24 CEST 2017
setting-up metameta environment
/work/users/thhaverk/scripts
activating metameta environment
Provided cluster nodes: 999
Job counts:
count jobs
1 all
1 clark_rpt
1 clark_run_1
6 clean_files
1 clean_reads
1 dudes_rpt
1 dudes_run_1
1 dudes_run_2
1 errorcorr_reads
1 gottcha_rpt
1 gottcha_run_1
1 kaiju_rpt
1 kaiju_run_1
1 kraken_rpt
1 kraken_run_1
1 krona
1 metametamerge
1 motus_rpt
1 motus_run_1
1 subsample_reads
1 trim_reads
26
rule trim_reads:
input: /projects/cees/in_progress/anthrax/data/clean_data/K1T0-B/K1T0-B.pair_1.fastq.gz
output: K1T0-B/reads/pre1.1.fq.gz
log: K1T0-B/log/trim_reads.log
benchmark: K1T0-B/log/trim_reads.time
wildcards: sample=K1T0-B
threads: 16

Unable to set utime on symlink K1T0-B/reads/pre1.1.fq.gz. Your Python build does not support it.
1 of 26 steps (4%) done
rule errorcorr_reads:
input: K1T0-B/reads/pre1.1.fq.gz
output: K1T0-B/reads/pre2.1.fq.gz
log: K1T0-B/log/errorcorr_reads.log
benchmark: K1T0-B/log/errorcorr_reads.time
wildcards: sample=K1T0-B
threads: 16

Unable to set utime on symlink K1T0-B/reads/pre2.1.fq.gz. Your Python build does not support it.
2 of 26 steps (8%) done

Conda environments are only allowed with shell, script or wrapper directives (not with run). - preproc.sm", line 27

Hi,

I installed MetaMeta according to the description on the GitHub page, but when I try to start the program it fails with the following error:

RuleException in line 27 of /home/stephan/work/miniconda/data/opt/metameta/scripts/preproc.sm:
Conda environments are only allowed with shell, script or wrapper directives (not with run).
  File "/home/stephan/work/miniconda/data/opt/metameta/Snakefile", line 24, in <module>
  File "/home/stephan/work/miniconda/data/opt/metameta/scripts/preproc.sm", line 27, in <module>

Do you know how to fix it?

Thanks,
Stephan
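
A possible culprit, though this is an assumption: the preproc.sm rule at that line combines a conda environment with a run: directive, which Snakemake releases newer than the ones metameta was built against refuse to accept. Checking the resolved Snakemake version and, if needed, pinning an older release might work:

# see which snakemake the metameta environment resolved to
snakemake --version
# hypothetical fix: pin an older snakemake release into the same environment
conda install -n metameta -c bioconda 'snakemake<4'

The exact compatible version boundary is a guess; the pipeline's own conda recipe would be the authoritative place to check.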

Can't install metameta with conda

I tried to install metameta with the command conda install metameta=1.2.0, but it only shows Error: No packages found matching: metameta 1.2|1.2.0* or Error: No packages found matching: metameta.

How can I install this package properly?
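
metameta appears to be published via the bioconda channel (the install commands elsewhere in these issues all use -c bioconda), so a plain conda install will not find it. A sketch:

# bioconda packages generally also expect conda-forge in the channel list
conda install -c conda-forge -c bioconda metameta

If version 1.2.0 specifically is needed, conda search -c bioconda metameta shows which versions the channel actually provides.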

Error in job metametamerge while creating output

Hi Vitor,

I made a custom database following your example, using only fungal sequences. I then classified my sequences against this database, and the run got almost to the end: the databases were created for the different tools and the classification step ran for each tool.

I then ran into a problem with the metametamerge.sm script.

This is the rule that was created for the step:


rule metametamerge:
    input: /work/users/thhaverk/MM_databases/fungi_database/kaiju_db_check.done, /work/users/thhaverk/MM_databases/fungi_database/clark_db_check.done, /work/users/thhaverk/MM_databases/fungi_database/kraken_db_check.done, /work/users/thhaverk/MM_databases/fungi_database/dudes_db_check.done, /work/users/thhaverk/MM_databases/names.dmp, K1T0/profiles/fungi_database/kaiju.profile.out, K1T0/profiles/fungi_database/clark.profile.out, K1T0/profiles/fungi_database/kraken.profile.out, K1T0/profiles/fungi_database/dudes.profile.out, /work/users/thhaverk/MM_databases/merged.dmp, /work/users/thhaverk/MM_databases/nodes.dmp, K1T0/profiles/fungi_database/kaiju_clean_files.done, K1T0/profiles/fungi_database/clark_clean_files.done, K1T0/profiles/fungi_database/kraken_clean_files.done, K1T0/profiles/fungi_database/dudes_clean_files.done
    output: K1T0/metametamerge/fungi_database/final.metametamerge.profile.out
    log: K1T0/log/fungi_database/metametamerge.log
    benchmark: K1T0/log/fungi_database/metametamerge.time
    wildcards: sample=K1T0, database=fungi_database

This is the error that I get:

Error in job metametamerge while creating output file K1T0-B/metametamerge/fungi_database/final.metametamerge.profile.out.
ClusterJobException in line 12 of /usit/abel/u1/thhaverk/miniconda3/envs/metameta/opt/metameta/scripts/metametamerge.sm:
Error executing rule metametamerge on cluster (jobid: 1, jobscript: /work/users/thhaverk/MMresults_test/.snakemake/tmp.oxye9o6w/snakejob.metametamerge.1.sh). For detailed error see the cluster log.
Job failed, going on with independent jobs.
Error in job metametamerge while creating output file K1T0/metametamerge/fungi_database/final.metametamerge.profile.out.
ClusterJobException in line 12 of /usit/abel/u1/thhaverk/miniconda3/envs/metameta/opt/metameta/scripts/metametamerge.sm:
Error executing rule metametamerge on cluster (jobid: 0, jobscript: /work/users/thhaverk/MMresults_test/.snakemake/tmp.oxye9o6w/snakejob.metametamerge.0.sh). For detailed error see the cluster log.
Job failed, going on with independent jobs.
Exiting because a job execution failed. Look above for error message

Next, I looked at the cluster log for this job, metametamerge.18496961.out:

Error in job metametamerge while creating output file K1T0/metametamerge/fungi_database/final.metametamerge.profile.out.
RuleException:
CalledProcessError in line 31 of /usit/abel/u1/thhaverk/miniconda3/envs/metameta/opt/metameta/scripts/metametamerge.sm:
Command 'MetaMetaMerge.py --input-files K1T0/profiles/fungi_database/dudes.profile.out K1T0/profiles/fungi_database/clark.profile.out K1T0/profiles/fungi_database/kaiju.profile.out K1T0/profiles/fungi_database/kraken.profile.out --database-profiles /work/users/thhaverk/MM_databases/fungi_database/dudes.dbprofile.out /work/users/thhaverk/MM_databases/fungi_database/clark.dbprofile.out /work/users/thhaverk/MM_databases/fungi_database/kaiju.dbprofile.out /work/users/thhaverk/MM_databases/fungi_database/kraken.dbprofile.out --tool-identifier 'dudes,clark,kaiju,kraken' --tool-method 'p,b,b,b' --names-file /work/users/thhaverk/MM_databases/names.dmp --nodes-file /work/users/thhaverk/MM_databases/nodes.dmp --merged-file /work/users/thhaverk/MM_databases/merged.dmp --bins 4 --cutoff 0.0001 --mode 'linear' --ranks 'species' --output-file K1T0/metametamerge/fungi_database/final.metametamerge.profile.out   --output-parsed-profiles > K1T0/log/fungi_database/metametamerge.log 2>&1' returned non-zero exit status 1
  File "/usit/abel/u1/thhaverk/miniconda3/envs/metameta/opt/metameta/scripts/metametamerge.sm", line 31, in __rule_metametamerge
  File "/usit/abel/u1/thhaverk/miniconda3/envs/metameta/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Exiting because a job execution failed. Look above for error message

From this I was still not sure what was happening, so I also checked the metametamerge.log file:

- - - - - - - - - - - - - - - - - - - - -
           MetaMetaMerge 1.1
- - - - - - - - - - - - - - - - - - - - -
Input files: 
 dudes (p) K1T0/profiles/fungi_database/dudes.profile.out /work/users/thhaverk/MM_databases/fungi_database/dudes.dbprofile.out
 clark (b) K1T0/profiles/fungi_database/clark.profile.out /work/users/thhaverk/MM_databases/fungi_database/clark.dbprofile.out
 kaiju (b) K1T0/profiles/fungi_database/kaiju.profile.out /work/users/thhaverk/MM_databases/fungi_database/kaiju.dbprofile.out
 kraken (b) K1T0/profiles/fungi_database/kraken.profile.out /work/users/thhaverk/MM_databases/fungi_database/kraken.dbprofile.out
Taxonomy: 
 /work/users/thhaverk/MM_databases/names.dmp, /work/users/thhaverk/MM_databases/nodes.dmp, /work/users/thhaverk/MM_databases/merged.dmp
Bins: 4
Cutoff: 0.0001
Mode: linear
Ranks: species
Output file (type): K1T0/metametamerge/fungi_database/final.metametamerge.profile.out (bioboxes)
Verbose: False
Detailed: False

- - - - - - - - - - - - - - - - - - - - -

Parsing taxonomy (names, nodes, merged) ... 

Reading database profiles ...
 - /work/users/thhaverk/MM_databases/fungi_database/dudes.dbprofile.out (tsv)
        species - 0 entries (0 ignored)
        (WARNING) no valid entries found [species]
Traceback (most recent call last):
  File "/usit/abel/u1/thhaverk/miniconda3/envs/metameta/bin/MetaMetaMerge.py", line 338, in <module>
    main()
  File "/usit/abel/u1/thhaverk/miniconda3/envs/metameta/bin/MetaMetaMerge.py", line 113, in main
    db = Databases(database_file, parse_files(database_file, 'db', all_names_scientific, all_names_other, nodes, merged, ranks, args.verbose), ranks)           
  File "/usit/abel/u1/thhaverk/miniconda3/envs/metameta/lib/python3.5/site-packages/metametamerge/Databases.py", line 8, in __init__
    Profile.__init__(self, profile, ranks)
  File "/usit/abel/u1/thhaverk/miniconda3/envs/metameta/lib/python3.5/site-packages/metametamerge/Profile.py", line 15, in __init__
    self.profilerank[rankid] = ProfileRank(profile[np.ix_(profile[:,1]==rankid, [0,2,3])],rankid,sum(profile[:,1]==rankid))
IndexError: too many indices for array

The output says that no valid entries were found, but the dudes.profile.out file contains tabular classification output. I changed the ranks in the yaml file to genus, but that gave the same error. Next I excluded the dudes output to see whether it was the problem, but then clark gave the same error. So I think it does not depend on the tool but on the parsing of the profile files.

So the error seems to come from the last lines of metametamerge.log. Any idea what is causing it and how to solve it?
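
One detail worth noting from the log: the "0 entries" warning is raised while "Reading database profiles", i.e. for dudes.dbprofile.out, not for the classification profile dudes.profile.out. A quick way to confirm what MetaMetaMerge is actually parsing, with the paths taken from the log above:

# is the database profile empty, or does it just lack rows at the configured rank?
wc -l /work/users/thhaverk/MM_databases/fungi_database/dudes.dbprofile.out
head -n 5 /work/users/thhaverk/MM_databases/fungi_database/dudes.dbprofile.out

If the file is empty, or its taxa sit at ranks other than the configured one, the downstream IndexError is just the unguarded consequence of an empty array.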

Combining output from samples?

Hey Vitor,

this is a general question. After running metameta, I get a metameta profile for each sample. I am not sure whether I missed something in the set-up of the yaml file, but is there a way to combine the profile files of all samples into one big tabular file? Or is that not implemented in the metameta workflow?

Best
Thomas
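
Merging does not appear to be part of the workflow itself, but since every sample's final profile lands at a fixed path ({sample}/metametamerge/{database}/final.metametamerge.profile.out in the run logs above), a post-hoc merge is a few lines of shell. A sketch that tags each row with its sample name:

# combine all per-sample final profiles into one table, sample name as first column
for f in */metametamerge/*/final.metametamerge.profile.out; do
    sample=${f%%/*}    # first path component is the sample directory
    awk -v s="$sample" 'BEGIN{OFS="\t"} {print s, $0}' "$f"
done > all_samples.profiles.tsv

Run this from the workdir; any header lines inside the individual profiles would need to be filtered out first.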

Error in job clean_files while creating output file...

Hi Vitor,

While running the MetaMeta pipeline, I encountered the following error:

Error in job clean_files while creating output file K1T0/profiles/archaea_bacteria/dudes_clean_files.done.
ClusterJobException in line 2 of /usit/abel/u1/thhaverk/miniconda3/envs/metameta/opt/metameta/scripts/clean_files.sm:
Error executing rule clean_files on cluster (jobid: 9, jobscript: /work/users/thhaverk/MMresults_test/.snakemake/tmp.odfh3g06/snakejob.clean_files.9.sh). For detailed error see the cluster log.
Job failed, going on with independent jobs.

The error is repeated on every run of the rule.

So I checked the cluster log file and found the following:

Error in job clean_files while creating output file K1T0/profiles/archaea_bacteria/dudes_clean_files.done.
RuleException:
CalledProcessError in line 13 of /usit/abel/u1/thhaverk/miniconda3/envs/metameta/opt/metameta/scripts/clean_files.sm:
Command 'if [ -d "K1T0/dudes/" ]; then if [ ! "$(ls -A K1T0/dudes/)" ]; then rm -dv K1T0/dudes/; fi; fi >> K1T0/log/archaea_bacteria/dudes_clean_files.log 2>&1' returned non-zero exit status 1
  File "/usit/abel/u1/thhaverk/miniconda3/envs/metameta/opt/metameta/scripts/clean_files.sm", line 13, in __rule_clean_files
  File "/usit/abel/u1/thhaverk/miniconda3/envs/metameta/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Exiting because a job execution failed. Look above for error message

I only see this error when setting the keepfiles option to "0" in the yaml file. When I set it to keep files, no errors occur.

I looked at the other issues, but I am not sure whether I am dealing with a similar problem here.

metameta is run from a SLURM script with the command:

metameta --configfile MM_config_4.yaml --keep-going -j 999 --cluster-config MM_cluster.json --cluster "sbatch --account=nn9244k --mem-per-cpu={cluster.mem} --partition={cluster.partition} --cpus-per-task={cluster.cpus-per-task} --time={cluster.time} --job-name={cluster.job-name} --output={cluster.output}"

the cluster settings for the clean_files job are:

"clean_files": { "job-name": "clean_f", "output": "clusterlog/clean_files.%j.out", "mem": "3800M", "cpus-per-task": 1, "time": "2:0:0"

Cheers
Thomas
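
To pin down which part of that one-liner fails, the shell fragment from the cluster log can be replayed by hand (same working directory as the pipeline):

# replay the clean_files fragment piece by piece
[ -d "K1T0/dudes/" ] && echo "dir exists"
ls -A K1T0/dudes/          # is it actually empty?
rm -dv K1T0/dudes/         # -d removes the directory only if it is empty
echo "exit status: $?"

A plausible suspect, though only an assumption: with keepfiles set to 0, another rule may already have removed the directory, or the log directory the command appends to, making one of the steps return non-zero.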

preinstall databases? (and install them somewhere specifically?)

I see in the README that metameta will download the databases on first run.

I am preparing the software in an environment where the users do not have privileges to write where it will be installed. I don't have what's needed to run the tool, so I need some means to kick off this download manually.

I'd also like to know where these get stashed away, because they seem quite large. Is one able to place them elsewhere, and inform metameta of their location?

I've attempted to puzzle out how this works in the source, but I can't make heads or tails of it.
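
Piecing this together from the other issues on this page: the databases go into the directory set by the dbdir: key in the config yaml, with one subfolder per database name, and the pre-configured archaea_bacteria tarballs live on Zenodo (the URLs are printed in a parameter dump further down this page). So a manual pre-fetch along these lines should be possible; the exact subfolder layout is an assumption inferred from the download log in the first issue:

# pre-fetch one pre-configured database into a shared, user-writable dbdir
DBDIR=/shared/metameta_databases    # hypothetical path; set dbdir: to this in the yaml
mkdir -p "$DBDIR/archaea_bacteria"
curl -L -o "$DBDIR/archaea_bacteria/kaiju.tar.gz" \
    https://zenodo.org/record/819425/files/kaiju_bac_arc_v1.tar.gz
tar -xzf "$DBDIR/archaea_bacteria/kaiju.tar.gz" -C "$DBDIR/archaea_bacteria"

Repeat per tool (clark, kraken, dudes, gottcha, motus) with the corresponding Zenodo URLs.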

Building custom database issue

Hi,

I'd like to build a custom database from some NCBI RefSeq sequences.
As a first step, I attempted to build a very small database, but it failed.

Here are the directory structure used for building the database, the config file, and the log file.

======= directory structure =======
./db
./db/clark
./db/clark/genomes.fna
./db/dudes
./db/dudes/genomes.fna
./db/kaiju
./db/kaiju/genome.gbff
./db/kraken
./db/kraken/genome.fna

======= config file =======
workdir: "/mss2/projects/META2/taxonomy_classification/metameta"

databases:
  - custom_db

custom_db:
  clark: "/mss2/projects/META2/taxonomy_classification/metameta/db/clark"
  dudes: "/mss2/projects/META2/taxonomy_classification/metameta/db/dudes"
  kaiju: "/mss2/projects/META2/taxonomy_classification/metameta/db/kaiju"
  kraken: "/mss2/projects/META2/taxonomy_classification/metameta/db/kraken"

dbdir: "/mss2/projects/META2/taxonomy_classification/metameta/db"

samples:
  "TEST":
    fq1: "test1_1.fq.gz"
    fq2: "test1_2.fq.gz"

gzipped: 1
threads: 50

======= Log file =======
Building DAG of jobs...
Provided cores: 5
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 clark_db_custom_1
1 clark_db_custom_2
1 clark_db_custom_3
1 clark_db_custom_4
1 clark_db_custom_check
1 clark_db_custom_profile
1 clark_rpt
1 clark_run_1
4 clean_files
1 clean_reads
4 database_profile
1 dudes_db_custom_1
1 dudes_db_custom_2
1 dudes_db_custom_3
1 dudes_db_custom_check
1 dudes_db_custom_profile
1 dudes_rpt
1 dudes_run_1
1 dudes_run_2
1 errorcorr_reads
1 get_accession2taxid
1 get_gi_taxid_nucl
1 get_taxdump
1 kaiju_db_custom_1
1 kaiju_db_custom_2
1 kaiju_db_custom_3
1 kaiju_db_custom_4
1 kaiju_db_custom_check
1 kaiju_db_custom_profile
1 kaiju_rpt
1 kaiju_run_1
1 kraken_db_custom_1
1 kraken_db_custom_2
1 kraken_db_custom_3
1 kraken_db_custom_check
1 kraken_db_custom_profile
1 kraken_rpt
1 kraken_run_1
1 krona
1 metametamerge
1 subsample_reads
1 trim_reads
49


MetaMeta Pipeline v1.2.0 by Vitor C. Piro ([email protected], http://github.com/pirovc)

Parameters:

 - bins: 4
 - custom_db: OrderedDict([('clark', 'db/clark'),
 ('dudes', 'db/dudes'),
 ('kaiju', 'db/kaiju'),
 ('kraken', 'db/kraken')])
 - cutoff: 0.0001
 - databases: ['custom_db']
 - dbdir: 'db'
 - desiredminlen: 70
 - detailed: 0
 - errorcorr: 0
 - gzipped: 1
 - keepfiles: 0
 - mode: 'linear'
 - ranks: 'species'
 - replacement: 0
 - samples: OrderedDict([('TEST',
 OrderedDict([('fq1', 'test1_1.fq.gz'),
 ('fq2', 'test1_2.fq.gz')]))])
 - samplesize: 1
 - strictness: 0.8
 - subsample: 0
 - threads: 50
 - tool_alt_path: {'bowtie2': '',
 'clark': '',
 'dudes': '',
 'gottcha': '',
 'kaiju': '',
 'kraken': '',
 'krona': '',
 'metametamerge': '',
 'motus': '',
 'spades': '',
 'trimmomatic': ''}
 - tools: {'clark': 'b',
 'dudes': 'p',
 'gottcha': 'p',
 'kaiju': 'b',
 'kraken': 'b',
 'motus': 'p'}
 - trimming: 0
 - verbose: 0
 - workdir: '.'

rule kaiju_db_custom_1:
output: dbcustom_db/kaiju_db/kaiju_db.faa
log: dbcustom_db/log/kaiju_db_custom_1.log
jobid: 48
benchmark: dbcustom_db/log/kaiju_db_custom_1.time
wildcards: database=custom_db

rule get_gi_taxid_nucl:
output: dbtaxonomy/gi_taxid_nucl.dmp.gz
log: dbtaxonomy/log/get_gi_taxid_nucl.log
jobid: 44
benchmark: dbtaxonomy/log/get_gi_taxid_nucl.time

rule get_taxdump:
output: dbtaxonomy/taxdump.tar.gz, dbtaxonomy/names.dmp, dbtaxonomy/nodes.dmp, dbtaxonomy/merged.dmp
log: dbtaxonomy/log/get_taxdump.log
jobid: 4
benchmark: dbtaxonomy/log/get_taxdump.time

rule get_accession2taxid:
output: dbtaxonomy/nucl_gb.accession2taxid.gz, dbtaxonomy/nucl_wgs.accession2taxid.gz
log: dbtaxonomy/log/get_accession2taxid.log
jobid: 47
benchmark: dbtaxonomy/log/get_accession2taxid.time

Activating conda environment /mss2/projects/META2/taxonomy_classification/metameta/.snakemake/conda/0e3e8e78.

rule clark_db_custom_1:
output: dbcustom_db/clark_db/Custom/
log: dbcustom_db/log/clark_db_custom_1.log
jobid: 40
benchmark: dbcustom_db/log/clark_db_custom_1.time
wildcards: database=custom_db

Finished job 40.
1 of 49 steps (2%) done

rule kaiju_db_custom_profile:
output: dbcustom_db/kaiju.dbaccession.out
log: dbcustom_db/log/kaiju_db_custom_profile.log
jobid: 36
benchmark: dbcustom_db/log/kaiju_db_custom_profile.time
wildcards: database=custom_db

Finished job 36.
2 of 49 steps (4%) done
Finished job 48.
3 of 49 steps (6%) done
Exiting because a job execution failed. Look above for error message
Will exit after finishing currently running jobs.
Finished job 44.
4 of 49 steps (8%) done
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message


An error has occured.
Please check the main log file for more information:
/mss2/projects/META2/taxonomy_classification/metameta/metameta_2019-08-15_23-59-47.log
Detailed output and execution time for each rule can be found at:
/mss2/projects/META2/taxonomy_classification/metameta/db/log/
/mss2/projects/META2/taxonomy_classification/metameta/SAMPLE_NAME/log/

=======

How can I build a custom database?
What did I miss?

Thank you,
Jongin
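
One thing that stands out when comparing this setup against the custom-db errors in a later issue on this page: the database-build rules grep '^VERSION' lines from *.gbff files for kaiju and accessioned FASTA headers from *.fna files for clark/dudes/kraken. A sanity check that the inputs match those expectations (patterns copied from that issue's failing commands):

# do the tool folders contain what the custom-db rules will look for?
grep -c '^VERSION' db/kaiju/*.gbff                                      # kaiju: GenBank flatfiles
grep -h '^>' db/clark/*.fna | grep -o '[A-Z]*_[0-9]*\.[0-9]*' | wc -l   # clark/dudes: accessioned headers
ls db/kraken/*.fna                                                      # kraken: FASTA files

Also note the mismatch between threads: 50 in the config and "Provided cores: 5" in the log, although that alone should only slow the run down rather than make it fail.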

Metameta fails with example file

Hi,

I have now tested metameta with an example file, and it also fails at the metametamerge step:

  • run.sh
metameta --configfile test.yaml --use-conda --keep-going --cores 5

  • test.yaml

workdir: "/home/stephan/work/10_meta_meta/output/"
dbdir: "/home/stephan/work/01_meta_meta_db/"

samples:
  "sample1":
    fq1: "/home/stephan/work/miniconda/data/opt/metameta/sampledata/files/reads/sample_data_archaea_bacteria.1.fq.gz"


databases:
  - archaea_bacteria_201503
tools:
    "clark": "p"
    "kaiju": "p"

threads: 30
gzipped: 1
keepfiles: 1
  • metametamerge.log
- - - - - - - - - - - - - - - - - - - - -
           MetaMetaMerge 1.1
- - - - - - - - - - - - - - - - - - - - -
Input files:
 clark (p) sample1/profiles/archaea_bacteria_201503/clark.profile.out /home/stephan/work/01_meta_meta_db/archaea_bacteria_201503/clark.dbprofile.out
 kaiju (p) sample1/profiles/archaea_bacteria_201503/kaiju.profile.out /home/stephan/work/01_meta_meta_db/archaea_bacteria_201503/kaiju.dbprofile.out
Taxonomy:
 /home/stephan/work/01_meta_meta_db/taxonomy/names.dmp, /home/stephan/work/01_meta_meta_db/taxonomy/nodes.dmp, /home/stephan/work/01_meta_meta_db/taxonomy/merged.dmp
Bins: 4
Cutoff: 0.0001
Mode: linear
Ranks: species
Output file (type): sample1/metametamerge/archaea_bacteria_201503/final.metametamerge.profile.out (bioboxes)
Verbose: False
Detailed: False
- - - - - - - - - - - - - - - - - - - - -

Parsing taxonomy (names, nodes, merged) ...

Reading database profiles ...
 - /home/stephan/work/01_meta_meta_db/archaea_bacteria_201503/clark.dbprofile.out (tsv)
        species - 1461 entries (0 ignored)
        2 taxons with merged entries [94694,1905730]
        Total - 1459 taxons
 - /home/stephan/work/01_meta_meta_db/archaea_bacteria_201503/kaiju.dbprofile.out (tsv)
        species - 2419 entries (0 ignored)
        4 taxons with merged entries [1405,94694,1334193,1905730]
        Total - 2415 taxons

Reading profiles ...
 - sample1/profiles/archaea_bacteria_201503/clark.profile.out (tsv)
        species - 0 entries (0 ignored)
        (WARNING) no valid entries found [species]
Traceback (most recent call last):
  File "/mnt/vdb1/stephan/10_meta_meta/output/.snakemake/conda/266fca67/bin/MetaMetaMerge.py", line 338, in <module>
    main()
  File "/mnt/vdb1/stephan/10_meta_meta/output/.snakemake/conda/266fca67/bin/MetaMetaMerge.py", line 131, in main
    tool = Tools(input_file, identifiers[idx], methods[idx], parse_files(input_file, methods[idx], all_names_scientific, all_names_other, nodes, merged, ranks, args.verbose), ranks, args.verbose)
  File "/mnt/vdb1/stephan/10_meta_meta/output/.snakemake/conda/266fca67/lib/python3.5/site-packages/metametamerge/Tools.py", line 11, in __init__
    Profile.__init__(self, profile, ranks)
  File "/mnt/vdb1/stephan/10_meta_meta/output/.snakemake/conda/266fca67/lib/python3.5/site-packages/metametamerge/Profile.py", line 15, in __init__
    self.profilerank[rankid] = ProfileRank(profile[np.ix_(profile[:,1]==rankid, [0,2,3])],rankid,sum(profile[:,1]==rankid))
IndexError: too many indices for array


Thanks for your help in advance,
Stephan
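
The failure here is on the classification side: clark.profile.out parses to zero species-level entries, while both database profiles read fine. Two checks suggested by the paths in the log, plus one observation: this config runs clark with method "p", whereas the parameter dumps elsewhere on this page treat clark as a binning tool ('b'), and such a method mismatch could plausibly make the parser find nothing:

# is clark's profile really empty, and what did the clark run itself report?
wc -l sample1/profiles/archaea_bacteria_201503/clark.profile.out
head -n 5 sample1/profiles/archaea_bacteria_201503/clark.profile.out
cat sample1/log/archaea_bacteria_201503/clark*.log    # log name pattern is an assumption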

Custom database does not work

Hi,

I tried creating a new custom database, but when running the pipeline I get the following error:

Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	1	all
	1	clark_db_custom_1
	1	clark_db_custom_2
	1	clark_db_custom_3
	1	clark_db_custom_check
	1	clark_db_custom_profile
	1	clark_rpt
	1	clark_run_1
	3	clean_files
	1	clean_reads
	3	database_profile
	1	kaiju_db_custom_2
	1	kaiju_db_custom_3
	1	kaiju_db_custom_check
	1	kaiju_db_custom_profile
	1	kaiju_rpt
	1	kaiju_run_1
	1	kraken_db_custom_1
	1	kraken_db_custom_3
	1	kraken_db_custom_check
	1	kraken_db_custom_profile
	1	kraken_rpt
	1	kraken_run_1
	1	krona
	1	metametamerge
	29
rule kaiju_db_custom_2:
    input: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa
    output: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.bwt, /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.sa
    log: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_2.log
    benchmark: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_2.time
    wildcards: database=new_custom_fungi_viral_db
    threads: 12

rule kraken_db_custom_1:
    output: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kraken_db/library/
    log: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kraken_db_custom_1.log
    benchmark: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kraken_db_custom_1.time
    wildcards: database=new_custom_fungi_viral_db
    threads: 12

rule kaiju_db_custom_profile:
    output: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju.dbaccession.out
    log: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_profile.log
    benchmark: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_profile.time
    wildcards: database=new_custom_fungi_viral_db

rule clark_db_custom_profile:
    output: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark.dbaccession.out
    log: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/clark_db_custom_profile.log
    benchmark: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/clark_db_custom_profile.time
    wildcards: database=new_custom_fungi_viral_db

rule clark_db_custom_1:
    output: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark_db/Custom/
    log: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/clark_db_custom_1.log
    benchmark: /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/clark_db_custom_1.time
    wildcards: database=new_custom_fungi_viral_db

grep: custom_fungi_viral_db/kaiju//*.gbff: No such file or directory
grep: custom_fungi_viral_db/clark_dudes//*.fna: No such file or directory
cat: custom_fungi_viral_db/clark_dudes//*.fna: No such file or directory
Error in job kaiju_db_custom_profile while creating output file /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju.dbaccession.out.
Error in job clark_db_custom_1 while creating output file /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark_db/Custom/.
Error in job clark_db_custom_profile while creating output file /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark.dbaccession.out.
RuleException:
CalledProcessError in line 44 of /home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kaiju_db_custom.sm:
Command 'grep -h '^VERSION' custom_fungi_viral_db/kaiju//*.gbff | tr -s ' ' | cut -d ' ' -f 2 > /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju.dbaccession.out 2> /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_profile.log' returned non-zero exit status 2
  File "/home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kaiju_db_custom.sm", line 44, in __rule_kaiju_db_custom_profile
  File "/home/stephan/work/miniconda/data/envs/py35/lib/python3.5/concurrent/futures/thread.py", line 55, in run
RuleException:
CalledProcessError in line 69 of /home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/clark_db_custom.sm:
Command 'grep -h '^>' custom_fungi_viral_db/clark_dudes//*.fna | grep -o '[A-Z]*_[0-9]*\.[0-9]*' | sed 's/>//g' > /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark.dbaccession.out 2> /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/clark_db_custom_profile.log' returned non-zero exit status 1
  File "/home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/clark_db_custom.sm", line 69, in __rule_clark_db_custom_profile
  File "/home/stephan/work/miniconda/data/envs/py35/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job kaiju_db_custom_profile since they might be corrupted:
/home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju.dbaccession.out
RuleException:
CalledProcessError in line 7 of /home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/clark_db_custom.sm:
Command '
		# Separate one file per sequence
		mkdir -p /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark_db/Custom/
		cat custom_fungi_viral_db/clark_dudes//*.fna | awk -v sep_seq_folder='/home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark_db/Custom/' '{if (substr($0, 1, 1)==">") {filename=(sep_seq_folder "/" substr($1,2) ".fna")}; print $0 > filename}'
		' returned non-zero exit status 1
  File "/home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/clark_db_custom.sm", line 7, in __rule_clark_db_custom_1
  File "/home/stephan/work/miniconda/data/envs/py35/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job clark_db_custom_profile since they might be corrupted:
/home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark.dbaccession.out
Removing output files of failed job clark_db_custom_1 since they might be corrupted:
/home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/clark_db/Custom/
Job failed, going on with independent jobs.
Job failed, going on with independent jobs.
Job failed, going on with independent jobs.
Error in job kaiju_db_custom_2 while creating output files /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.bwt, /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.sa.
RuleException:
CalledProcessError in line 17 of /home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kaiju_db_custom.sm:
Command 'mkbwt -n 5 -a ACDEFGHIKLMNPQRSTVWY -nThreads 12 -o /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa > /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_2.log 2>&1' returned non-zero exit status 5
  File "/home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kaiju_db_custom.sm", line 17, in __rule_kaiju_db_custom_2
  File "/home/stephan/work/miniconda/data/envs/py35/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Job failed, going on with independent jobs.
ls: cannot access custom_fungi_viral_db/kraken//*.fna: No such file or directory
Error in job kraken_db_custom_1 while creating output file /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kraken_db/library/.
RuleException:
CalledProcessError in line 8 of /home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kraken_db_custom.sm:
Command 'ls -t custom_fungi_viral_db/kraken//*.fna | xargs --max-procs=12 -I '{}' kraken-build --db /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kraken_db/ --add-to-library '{}' >> /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kraken_db_custom_1.log 2>&1' returned non-zero exit status 2
  File "/home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kraken_db_custom.sm", line 8, in __rule_kraken_db_custom_1
  File "/home/stephan/work/miniconda/data/envs/py35/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job kraken_db_custom_1 since they might be corrupted:
/home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kraken_db/library/
Job failed, going on with independent jobs.
Exiting because a job execution failed. Look above for error message

---------------------------------------------------------------------------------------
MetaMeta Pipeline v1.1 by Vitor C. Piro ([email protected], http://github.com/pirovc)
---------------------------------------------------------------------------------------
Parameters:
 - archaea_bacteria: {'clark': 'https://zenodo.org/record/820055/files/clark_bac_arc_v1.tar.gz',
 'dudes': 'https://zenodo.org/record/820053/files/dudes_bac_arc_v1.tar.gz',
 'gottcha': 'https://zenodo.org/record/819341/files/gottcha_bac_arc_v1.tar.gz',
 'kaiju': 'https://zenodo.org/record/819425/files/kaiju_bac_arc_v1.tar.gz',
 'kraken': 'https://zenodo.org/record/819363/files/kraken_bac_arc_v1.tar.gz',
 'motus': 'https://zenodo.org/record/819365/files/motus_bac_arc_v1.tar.gz'}
 - bins: 4
 - cutoff: 0.0001
 - databases: ['new_custom_fungi_viral_db']
 - dbdir: '/home/stephan/work/epityp/mirnaseq/05_metameta/databases/'
 - desiredminlen: 70
 - detailed: 0
 - errorcorr: 0
 - gzipped: 1
 - keepfiles: 1
 - mode: 'linear'
 - new_custom_fungi_viral_db: {'clark': 'custom_fungi_viral_db/clark_dudes/',
 'dudes': 'custom_fungi_viral_db/clark_dudes/',
 'kaiju': 'custom_fungi_viral_db/kaiju/',
 'kraken': 'custom_fungi_viral_db/kraken/'}
 - ranks: 'species'
 - replacement: 0
 - samples: {'sample_name_1': {'fq1': '/home/stephan/work/epityp/mirnaseq/05_metameta/input/not_mapped.fastq.gz'}}
 - samplesize: 1
 - strictness: 0.8
 - subsample: 0
 - threads: 12
 - tool_alt_path: {'bowtie2': '',
 'clark': '',
 'dudes': '',
 'gottcha': '',
 'kaiju': '',
 'kraken': '',
 'krona': '',
 'metametamerge': '',
 'motus': '',
 'spades': '',
 'trimmomatic': ''}
 - tools: {'clark': 'b', 'gottcha': 'p', 'kaiju': 'b', 'kraken': 'b', 'motus': 'p'}
 - trimming: 0
 - verbose: 0
 - workdir: '/home/stephan/work/epityp/mirnaseq/05_metameta/output/'
---------------------------------------------------------------------------------------

  • yaml (/home/stephan/work/epityp/mirnaseq/05_metameta/epitype_test.yaml)
workdir: "/home/stephan/work/epityp/mirnaseq/05_metameta/output/"

## Database output directory (Tip: create this folder in a common directory so it could be used for other runs as well as other users)
dbdir: "/home/stephan/work/epityp/mirnaseq/05_metameta/databases/"

## Sample (name and files)
samples:
  "sample_name_1":
    fq1: "/home/stephan/work/epityp/mirnaseq/05_metameta/input/not_mapped.fastq.gz"
    ## Add more samples here


## Custom database

databases:
  - "new_custom_fungi_viral_db"


"new_custom_fungi_viral_db":
    clark:  "custom_fungi_viral_db/clark_dudes/"
    dudes:  "custom_fungi_viral_db/clark_dudes/"
    kaiju:  "custom_fungi_viral_db/kaiju/"
    kraken: "custom_fungi_viral_db/kraken/"


################################################################

## Configured tools (p=profiling, b=binning) from tools folder (tool.sm and tool_db.sm)
tools:
    "clark": "b"
#    "dudes": "p"
    "gottcha": "p"
    "kaiju": "b"
    "kraken": "b"
    "motus": "p"


### MetaMeta Pipeline ###
## Number of threads for each tool (distributed among the number of cores defined by main parameter --cores)
threads: 12

## Gzipped input files (0: not gzipped / 1: gzipped). Default: 0
gzipped: 1

## Keep intermediate files (database, reads and output) (0: do not keep files / 1: keep all files). Default: 0
keepfiles: 1 ### TODO change to 0
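
One pattern in the errors above: every failing command uses the custom database paths verbatim (custom_fungi_viral_db/...), and those are relative, while the pipeline runs inside workdir. A quick check of that hypothesis, with the path taken from this yaml:

# relative db paths resolve against the workdir the pipeline runs in
cd /home/stephan/work/epityp/mirnaseq/05_metameta/output/
ls custom_fungi_viral_db/kaiju/*.gbff custom_fungi_viral_db/clark_dudes/*.fna

If that ls fails, switching the custom database entries in the yaml to absolute paths would be the obvious sketch of a fix.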

Thanks,
Stephan

Pipeline fails with "Exiting because a job execution failed. Look above for error message"

Hi,

I could successfully start the pipeline, but it now fails with the following error:

Creating conda environment for /home/stephan/work/miniconda/data/envs/py35/opt/metameta/scripts/../envs/metametamerge.yaml...
Environment for /home/stephan/work/miniconda/data/envs/py35/opt/metameta/scripts/../envs/metametamerge.yaml created.
Error in job metametamerge while creating output file sample_name_1/metametamerge/archaea_bacteria/final.metametamerge.profile.out.
RuleException:
CalledProcessError in line 31 of /home/stephan/work/miniconda/data/envs/py35/opt/metameta/scripts/metametamerge.sm:
Command 'MetaMetaMerge.py --input-files sample_name_1/profiles/archaea_bacteria/kraken.profile.out sample_name_1/profiles/archaea_bacteria/clark.profile.out sample_name_1/profiles/archaea_bacteria/motus.profile.out sample_name_1/profiles/archaea_bacteria/kaiju.profile.out sample_name_1/profiles/archaea_bacteria/gottcha.profile.out --database-profiles /home/stephan/work/mirnaseq/05_metameta/databases/archaea_bacteria/kraken.dbprofile.out /home/stephan/work/mirnaseq/05_metameta/databases/archaea_bacteria/clark.dbprofile.out /home/stephan/work/mirnaseq/05_metameta/databases/archaea_bacteria/motus.dbprofile.out /home/stephan/work/mirnaseq/05_metameta/databases/archaea_bacteria/kaiju.dbprofile.out /home/stephan/work/mirnaseq/05_metameta/databases/archaea_bacteria/gottcha.dbprofile.out --tool-identifier 'kraken,clark,motus,kaiju,gottcha' --tool-method 'b,b,p,b,p' --names-file /home/stephan/work/mirnaseq/05_metameta/databases/names.dmp --nodes-file /home/stephan/work/mirnaseq/05_metameta/databases/nodes.dmp --merged-file /home/stephan/work/mirnaseq/05_metameta/databases/merged.dmp --bins 3 --cutoff 0.0001 --mode 'linear' --ranks 'species' --output-file sample_name_1/metametamerge/archaea_bacteria/final.metametamerge.profile.out --output-parsed-profiles > sample_name_1/log/archaea_bacteria/metametamerge.log 2>&1' returned non-zero exit status 1
  File "/home/stephan/work/miniconda/data/envs/py35/opt/metameta/scripts/metametamerge.sm", line 31, in __rule_metametamerge
  File "/home/stephan/work/miniconda/data/envs/py35/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Job failed, going on with independent jobs.
Exiting because a job execution failed. Look above for error message

Do you know how to fix it?

Thanks,
Stephan
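
Since the failing command redirects everything into a log file (its path is visible at the end of the command string), the actual Python traceback should be there rather than in the console output. A first step:

# MetaMetaMerge.py's stdout/stderr were redirected here by the pipeline
cat sample_name_1/log/archaea_bacteria/metametamerge.log

Judging from the other issues on this page, that log will most likely show the same "no valid entries found" / IndexError pattern discussed above.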
