bodeolukolu / qmatey
Quantitative metagenomic alignment and taxonomic exact-matching
Hello,
I want to use Qmatey to classify NGS metagenomic data against a custom database. I have tried many times, but the run has never completed successfully.
My project directory contains the following files:
- config.sh
- reference_genomes.fa: all reference genomes merged into a single FASTA file
- Paired-end reads: sample_R1.fq, sample_R2.fq
- seqid2taxid.map: maps each sequence in reference_genomes.fa to a species taxid
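A quick consistency check between the merged FASTA and the map file can be sketched as follows. This is only a sketch under assumptions: `missing_seqids` is a hypothetical helper name, and the map is assumed to be a whitespace-separated file whose first column is the sequence ID (the first word of each FASTA header) and whose second column is the taxid. Whether Qmatey expects exactly this layout is not confirmed here.

```shell
# Hypothetical helper (not part of Qmatey): print every FASTA sequence ID
# that has no entry in the map file.
# Assumption: column 1 of the map is the sequence ID, i.e. the first word
# of the FASTA header without the leading ">".
missing_seqids() {
  local fasta="$1"
  local map="$2"
  awk '
    # First argument (the map): remember every sequence ID in column 1.
    FILENAME == ARGV[1] { seen[$1] = 1; next }
    # Second argument (the FASTA): report header IDs that were never seen.
    /^>/ {
      id = substr($1, 2)
      if (!(id in seen)) print id
    }
  ' "$map" "$fasta"
}

# Example call (paths are placeholders):
#   missing_seqids reference_genomes.fa seqid2taxid.map
```

If the function prints nothing, every header in the FASTA has a matching first-column entry in the map; any printed ID would be a sequence without a taxid assignment.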
Here is my config.sh:
#General_parameters
####################################################
threads=24
cluster=false
samples_alt_dir=false
library_type=WGS
HDsubsample=false
subsample_shotgun_R1=true
subsample_shotgun_R2=true
shotgun_min_read_length=100
#simulation_parameters
####################################################
simulation_lib=complete_digest
simulation_motif_R1=ATGCAT
simulation_motif_R2=CATG
fragment_size_range=64,600
max_read_length=150
gcov=3
#Normalization
####################################################
normalization=true
#MegaBLAST
####################################################
blast_location=custom
local_db=
#local_db=/media/sdd/ncbi_db/nt/nr
#local_db=/media/sdb/ncbi_db/16S/16S_ribosomal_RNA,/media/sdb/ncbi_db/18S/18S_fungal_sequences,/media/sdb/ncbi_db/28S/28S_fungal_sequences,/media/sdb/ncbi_db/ITS/ITS_eukaryote_sequences
#local_db=/media/sdd/ncbi_db/refseq/refseq_rna,/media/sdd/ncbi_db/refseq/ref_viroids_rep_genomes,/media/sdd/ncbi_db/refseq/ref_prok_rep_genomes,/media/sdd/ncbi_db/refseq/ref_euk_rep_genomes
taxids=true
input_dbfasta=/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/db/reference_genomes.fa
map_taxids=/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/taxids/seqid2taxid.map
#Taxonomic_Profiling_and_Filtering
####################################################
taxonomic_level=species
spearman_corr=species
CCLasso_corr=
min_percent_sample=10,20
min_pos_corr=0.1,0.2,0.3
max_neg_corr=0.1,0.2,0.3
#Visualizations
####################################################
sunburst_taxlevel=species
sunburst_nlayers=species
#Advanced_Parameters
####################################################
nodes=1
minRD=0
fullqlen_alignment=false
reads_per_megablast=10000
reads_per_megablast_burn_in=10000
zero_inflated=0.01
exclude_rRNA=false
annotate_seq=false
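Since the custom-database run depends on the two file paths set above, a small pre-flight check can be sketched as follows. It is a sketch under stated assumptions: `check_config_paths` is a hypothetical helper, not something Qmatey provides, and it only tests that `input_dbfasta` and `map_taxids` (the variable names used in the config above) point to existing, non-empty files. It says nothing about whether their contents are in the format Qmatey expects.

```shell
# Hypothetical pre-flight check (not part of Qmatey): verify that the
# custom-database paths defined in a config file exist and are non-empty.
check_config_paths() {
  local config="$1"
  local status=0
  local name
  local path
  for name in input_dbfasta map_taxids; do
    # Source the config in a subshell so the caller's variables stay untouched,
    # then print the value of the variable whose name is stored in $name.
    path=$( . "$config" >/dev/null 2>&1; eval "printf '%s' \"\$$name\"" )
    if [ -z "$path" ]; then
      echo "UNSET   $name"
      status=1
    elif [ ! -s "$path" ]; then
      echo "MISSING $name=$path"
      status=1
    else
      echo "OK      $name=$path"
    fi
  done
  return "$status"
}

# Example call (give a path containing a slash so the shell does not search PATH):
#   check_config_paths ./config.sh
```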
However, when I ran Qmatey, I got the following errors:
- only paired-end reads available in pe-folder
ls: cannot access '*_R2.fasta.gz': No such file or directory
ls: cannot access '*.R2.fasta.gz': No such file or directory
cat: sample.fq.tmp1.txt: No such file or directory
rm: cannot remove 'sample.fq.tmp1.txt': No such file or directory
normalization reference folder is empty, Qmatey will not exclude any read
Qmatey will use read coverage of samples for normalization
- calculating a normalization factor
- compile metagenome reads & compute relative read depth
cat: '/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/taxids/*.txids': No such file or directory
- performing custom BLAST
USAGE
blastn [-h] [-help] [-import_search_strategy filename]
[-export_search_strategy filename] [-task task_name] [-db database_name]
[-dbsize num_letters] [-gilist filename] [-seqidlist filename]
[-negative_gilist filename] [-negative_seqidlist filename]
[-taxids taxids] [-negative_taxids taxids] [-taxidlist filename]
[-negative_taxidlist filename] [-entrez_query entrez_query]
[-db_soft_mask filtering_algorithm] [-db_hard_mask filtering_algorithm]
[-subject subject_input_file] [-subject_loc range] [-query input_file]
[-out output_file] [-evalue evalue] [-word_size int_value]
[-gapopen open_penalty] [-gapextend extend_penalty]
[-perc_identity float_value] [-qcov_hsp_perc float_value]
[-max_hsps int_value] [-xdrop_ungap float_value] [-xdrop_gap float_value]
[-xdrop_gap_final float_value] [-searchsp int_value] [-penalty penalty]
[-reward reward] [-no_greedy] [-min_raw_gapped_score int_value]
[-template_type type] [-template_length int_value] [-dust DUST_options]
[-filtering_db filtering_database]
[-window_masker_taxid window_masker_taxid]
[-window_masker_db window_masker_db] [-soft_masking soft_masking]
[-ungapped] [-culling_limit int_value] [-best_hit_overhang float_value]
[-best_hit_score_edge float_value] [-subject_besthit]
[-window_size int_value] [-off_diagonal_range int_value]
[-use_index boolean] [-index_name string] [-lcase_masking]
[-query_loc range] [-strand strand] [-parse_deflines] [-outfmt format]
[-show_gis] [-num_descriptions int_value] [-num_alignments int_value]
[-line_length line_length] [-html] [-sorthits sort_hits]
[-sorthsps sort_hsps] [-max_target_seqs num_sequences]
[-num_threads int_value] [-mt_mode int_value] [-remote] [-version]
DESCRIPTION
Nucleotide-Nucleotide BLAST 2.14.0+
Use '-help' to print detailed descriptions of command line arguments
========================================================================
Error: Too many positional arguments (1), the offending value: wait
Error: (CArgException::eSynopsis) Too many positional arguments (1), the offending value: wait
gzip: subfileF1_1_out.blast: No such file or directory
cat: '*_out.blast.gz': No such file or directory
ls: cannot access '*subfile*': No such file or directory
gzip: /home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/metagenome/alignment/combined_compressed.megablast.gz: unexpected end of file
gzip: /home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/metagenome/alignment/combined_compressed.megablast.gz: unexpected end of file
awk: fatal: cannot open file `/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/taxids/*.txids' for reading (No such file or directory)
awk: fatal: cannot open file `/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/taxids/*.txids' for reading (No such file or directory)
- performing cross-taxon validation and filtering at class level
/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/Qmatey_run.sh: line 6110: cd: /home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/metagenome/results/class_level: No such file or directory
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Error in file(file, "rt") : cannot open the connection
Calls: read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file '../phylum_level/phylum_taxainfo_unique_sequences.txt': No such file or directory
Execution halted
- performing cross-taxon validation and filtering at order level
/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/Qmatey_run.sh: line 6117: cd: /home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/metagenome/results/order_level: No such file or directory
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Error in file(file, "rt") : cannot open the connection
Calls: read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file '../class_level_validated/class_taxainfo_unique_sequences.txt': No such file or directory
Execution halted
- performing cross-taxon validation and filtering at family level
/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/Qmatey_run.sh: line 6124: cd: /home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/metagenome/results/family_level: No such file or directory
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Error in file(file, "rt") : cannot open the connection
Calls: read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file '../order_level_validated/order_taxainfo_unique_sequences.txt': No such file or directory
Execution halted
- performing cross-taxon validation and filtering at genus level
/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/Qmatey_run.sh: line 6131: cd: /home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/metagenome/results/genus_level: No such file or directory
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Error in file(file, "rt") : cannot open the connection
Calls: read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file '../family_level_validated/family_taxainfo_unique_sequences.txt': No such file or directory
Execution halted
- performing cross-taxon validation and filtering at species level
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Error in file(file, "rt") : cannot open the connection
Calls: read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file '../genus_level_validated/genus_taxainfo_unique_sequences.txt': No such file or directory
Execution halted
- performing cross-taxon validation and filtering at strain level (mininimum unique seqeuence = 1)
/home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/Qmatey_run.sh: line 6145: cd: /home/work/wenhai/metaprofiling/bacteria_refgenome_NCBIdata/pggb_vg/big_sample/alternative_methods/Qmatey/Qmatey/metagenome/results/strain_level_minUniq_1: No such file or directory
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Error in file(file, "rt") : cannot open the connection
Calls: read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file '../species_level_validated/species_taxainfo_unique_sequences.txt': No such file or directory
Execution halted
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Error in file(file, "rt") : cannot open the connection
Calls: read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file './phylum_level/phylum_taxainfo_unique_sequences.txt': No such file or directory
Execution halted
########################################################
Qmatey is creating sunburst
########################################################
- creating sunburst from species_level taxonomic profile
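Many of the messages above repeat, so a condensed view of the log may make triage easier. The following is a rough sketch, assuming the run output was saved to a file; `summarize_errors` is a hypothetical helper name. It strips ANSI color codes, keeps lines that report a missing file or start with `Error`, and counts identical lines.

```shell
# Hypothetical log triage helper (not part of Qmatey).
# Strips ANSI color sequences, then counts identical error lines.
summarize_errors() {
  local esc
  esc=$(printf '\033')                      # literal ESC character
  sed "s/${esc}\[[0-9;]*m//g" "$1" |
    grep -E 'No such file or directory|^Error' |
    sort | uniq -c | sort -rn
}

# Example call (the log file name is a placeholder):
#   summarize_errors qmatey_run.log
```

The most frequent lines appear first. This only condenses the log; working out which failure came first still requires reading the original output in order.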
All result files, except for a few large ones, are provided here: Qmatey.tar.gz. Could you take a look and give me some suggestions, especially regarding the parameter settings in config.sh?
Thank you in advance!