jiwoongbio / fmap

Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies

License: Other

Perl 87.02% Shell 12.98%

fmap's Issues

Generate pathway maps without comparison

Hi,
I find the tool promising, but I was wondering whether it is possible to run the analysis without comparing two groups. For example, I have samples from unique environments and want to explore which pathways are present in a single sample.
Thanks in advance,
Erik

32 bit usearch memory limit error when using usearch in FMAP_download.pl

Hi,
While trying to run FMAP_download.pl, I added the usearch path (because I use a Mac, I can't use DIAMOND).

Is it better to build the usearch database at this step, or can I simply skip it? When I use the 32-bit usearch (free of charge), I get the following error:

`2020-01-03 13:30:17 URL:https://qbrc.swmed.edu/FMAP/FMAP_data/database [46/46] -> "/Users/daia1/MSK/work/projects/FMAP/software/FMAP/FMAP_data/database" [1]
2020-01-03 13:41:10 URL:https://qbrc.swmed.edu/FMAP/FMAP_data/orthology_uniref90_2_2157_4751.20190806012959.fasta.gz [992986879/992986879] -> "/Users/daia1/MSK/work/projects/FMAP/software/FMAP/FMAP_data/orthology_uniref90_2_2157_4751.20190806012959.fasta.gz" [1]
usearch v11.0.667_i86osx32, 4.0Gb RAM (34.4Gb total), 16 cores
(C) Copyright 2013-18 Robert C. Edgar, all rights reserved.
https://drive5.com/usearch

License: [email protected]

00:10 2.3Gb 100.0% Reading /Users/daia1/MSK/work/projects/FMAP/software/FMAP/FMAP_data/orthology_uniref90_2_2157_4751.20190806012959.fasta
00:27 2.2Gb 100.0% Masking (fastamino)
01:02 2.2Gb 100.0% Word stats
usearch11.0.667_i86osx32(33884,0x420080) malloc: can't allocate region
*** mach_vm_map(size=268337152) failed (error code=3)
usearch11.0.667_i86osx32(33884,0x420080) malloc: *** set a breakpoint in malloc_error_break to debug

/Users/daia1/projects/FMAP/software/usearch11.0.667_i86osx32 -makeudb_ublast /Users/daia1/MSK/work/projects/FMAP/software/FMAP/FMAP_data/orthology_uniref90_2_2157_4751.20190806012959.fasta -output /Users/daia1/MSK/work/projects/FMAP/software/FMAP/FMAP_data/orthology_uniref90_2_2157_4751.20190806012959.udb

---Fatal error---
Memory limit of 32-bit process exceeded, 64-bit build required`

Thank you and happy new year.

Angel

Error in FMAP_plot.pl

Hello,

I am running into issues when using the FMAP_plot.pl script.

When I run: perl /usr/local/FMAP/FMAP_plot.pl FMAP_bulksw.vs.ecosphere/bulk_vs_coralcubaonly.comparison.pathways.txt > bulk_vs_coralcubaonly.comparison.pathways.pdf

I get the following error:
ERROR in /usr/local/FMAP/FMAP_plot.pl: Use of uninitialized value $_ in pattern match (m//) at /usr/local/FMAP/FMAP_plot.pl line 38.

Any suggestions?

Heatmap of KOs used in pathway analysis

Hello!

I would like to make a heatmap of the abundances of KOs that are linked to the significantly enriched pathways across all of my samples. Is there a way to do this using FMAP? For example, if the photosynthesis pathway is enriched across a sample group, would it be possible to link the pathway back to the KO's used to construct that pathway and then construct a data matrix of those abundances?

Thanks for your time!

DIAMOND has no option --query

Greetings

I have been trying to use your tool, and a problem arose in the FMAP_assembly.pl part of the workflow.

diamond: error: no such option: --query
ERROR in FMAP_assembly.pl: readline() on closed filehandle $reader at FMAP_assembly.pl line 283.

I compiled FMAP as it is suggested, and DIAMOND was installed as follows:

wget http://github.com/bbuchfink/diamond/releases/download/v0.9.9/diamond-linux64.tar.gz
tar xzf diamond-linux64.tar.gz
sudo apt-get install python-configobj

Since I have used DIAMOND before without such errors, I have no idea what is going on here. I did associate FMAP with the DIAMOND binary, which so far had given me no errors.

Thank you for your attention

Comparison usage

Hello,
Thank you for the excellent pipeline tool.

I have a question about FMAP_comparison.pl. I have 4 samples in duplicate. How can I run the comparison step (that is, how do I specify the sample group information)?

Thanks.

Failed to create thread

Hi, any idea why it's not creating the threads?

echo "perl /home/da/work/FMAP/FMAP_mapping.pl allgenes.fa -p 10 -t tmp > outfile" | qsub -pe parallel_smp 10 -e err -o out

==> err <==
[3.08769s]
Building query histograms... [0.144188s]
Error: Failed to create thread.

Getting error at FMAP_quantification.pl

Hi there,
I am testing out your program with a single sample. The FMAP_assembly workflow worked. However, when I try the FMAP_mapping workflow, I get an error when passing the output of FMAP_mapping.pl to FMAP_quantification.pl.

$ FMAP_quantification.pl blastx_hits.txt > CS1BS_mapping_abundance.txt

ERROR in /home/joseph/biotools/programs/FMAP-0.10/FMAP_quantification.pl: Use of uninitialized value $percentIdentity in numeric lt (<) at /home/joseph/biotools/programs/FMAP-0.10/FMAP_quantification.pl line 86, <$reader> line 1.

Perl version is v5.22.1 on Ubuntu Server.

First 10 lines of my blastx_hits.txt

M00566:77:000000000-ATBW8:1:2106:7118:2648 K06911_UniRef90_E8YRU9 50.0 70 35 0 211 2 28 97 5.5e-11 70.9
M00566:77:000000000-ATBW8:1:2106:14488:2648 K08641_UniRef90_A0A1D7QDD3 50.7 71 35 0 1 213 158 228 1.1e-17 93.6
M00566:77:000000000-ATBW8:1:2106:9279:2649 K02037_UniRef90_A0A0D5NE94 75.3 97 24 0 291 1 186 282 1.4e-33 146.4
M00566:77:000000000-ATBW8:1:2106:8751:2649 K13688_UniRef90_D3P6W0 51.8 112 40 1 1 294 296 407 5.6e-19 97.8
M00566:77:000000000-ATBW8:1:2106:9476:2650 K00784_UniRef90_D2BNS6 43.5 62 35 0 188 3 8 69 3.8e-07 58.5
M00566:77:000000000-ATBW8:1:2106:16489:2650 K01961_UniRef90_F8I902 76.9 52 12 0 24 179 1 52 2.9e-15 84.7
M00566:77:000000000-ATBW8:1:2106:14734:2650 K02660_UniRef90_K9THH3 41.8 98 55 2 300 10 665 761 1.0e-07 60.5
M00566:77:000000000-ATBW8:1:2106:14734:2650 K02660_UniRef90_K9THH3 37.9 58 33 1 198 25 576 630 5.4e-01 38.1
M00566:77:000000000-ATBW8:1:2106:7969:2650 K03088_UniRef90_C1AB78 57.3 89 38 0 6 272 14 102 5.9e-23 110.9

FMAP_module.pl problem

Dear sir,
I am doing a functional comparison of two groups of samples. I used ngless to do the KEGG-based profiling,
and I tried to use FMAP for the functional comparison.
I am having a problem with the FMAP_module.pl command.
It complains that the Fisher test failed with the following error.

Problem while running this R command:
p.value <- fisher.test(matrix(c(18,13,4776,-2406), 2), alternative = "greater")$p.value

Error:
fisher.test(matrix(c(18, 13, 4776, -2406), 2), alternative = "greater") :
all entries of 'x' must be nonnegative and finite
Execution halted

However, the same script, FMAP_module.pl, works fine for the example dataset.
I modified the script to print the intermediate values:
my $test= $orthologyCount - $moduleOrthologyCount - $targetOrthologyCount + $moduleTargetOrthologyCount;
print $orthologyCount,",",$moduleOrthologyCount,",",$targetOrthologyCount,",",$moduleTargetOrthologyCount,",",$test,"\n";

The result looks like this:
2401,31,4794,18,-2406
2401,12,4794,7,-2398
2401,20,4794,13,-2400
2401,15,4794,10,-2398
2401,1,4794,1,-2393
2401,6,4794,4,-2395
2401,6,4794,4,-2395
2401,5,4794,5,-2393
2401,40,4794,27,-2406
2401,6,4794,5,-2394
2401,34,4794,22,-2405
2401,9,4794,7,-2395
2401,9,4794,6,-2396
2401,12,4794,6,-2399
2401,4,4794,3,-2394
2401,14,4794,9,-2398

This made me think about the KEGG orthology I used. The K numbers came from gene profiling by eggnog.
I noticed that many of them are not in your KEGG database. (You seem to use KEGG_orthology.txt for profiling?)
I assume the variable names reflect their meanings. Could you please explain what each one means and how it is calculated?
Are they all based on the orthologies in KEGG_orthology2module.txt?
$orthologyCount ---- total orthology count?
$moduleOrthologyCount ---- total orthology count in this module?
$targetOrthologyCount ---- count of orthologies that passed?
$moduleTargetOrthologyCount ---- count of orthologies that passed in this module?

Please kindly advise.

my1.comparison.txt
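
The negative value that R rejects follows directly from the arithmetic in the debug print. The sketch below reproduces it for the first row of the printed output; the function name `fourth_cell` is hypothetical and is not part of FMAP, and the argument order simply mirrors the print statement in the post.

```shell
#!/bin/sh
# Hypothetical helper (not part of FMAP): recomputes the fourth cell of the
# 2x2 table exactly as in the debug line from the post.
# Arguments, in order:
#   orthologyCount moduleOrthologyCount targetOrthologyCount moduleTargetOrthologyCount
fourth_cell() {
    echo $(( $1 - $2 - $3 + $4 ))
}

# First row of the debug output: 2401,31,4794,18
fourth_cell 2401 31 4794 18   # prints -2406

# In every printed row targetOrthologyCount (4794) is larger than
# orthologyCount (2401), which is what drives the cell below zero.
```

Since `fisher.test` requires non-negative entries, any row where the third count exceeds the first by this much will fail in the same way. That observation may be consistent with the suspicion in the post that many K numbers fall outside the set being counted, but the thread does not confirm the cause.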

Interpretation of FMAP pathway plot

Hi,
I am working on a dataset consisting of samples from normal and Crohn's disease patients. After mapping KO genes to pathways I get a plot of pathways, but how do I determine which sample group each differentially abundant pathway comes from?

Abundances

Hi,
When computing abundances I see strange behaviour (see example below).
Each file was processed individually and then both together.

I would expect the count for the two files together to be 11+7=18 genes, but it returns 17.
I am not sure whether there is an extra filter applied beyond the 80% identity threshold.

perl FMAP_quantification.pl 36.uniprot.blast.txt  | grep K00001
K00001  E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]       7     997.386527521675

perl FMAP_quantification.pl 35.uniprot.blast.txt  | grep K00001
K00001  E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]       11      416.346777952545

perl FMAP_quantification.pl 35.uniprot.blast.txt  36.uniprot.blast.txt | grep K00001
K00001  E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]       17      537.487924074303

RPK calculation

Hi,
I really like FMAP and am starting to learn how to use it. I just wondered how you calculate RPK: I saw your code says readRpk = 1 / ($proteinLength * 3 / 1000). Why do you multiply the protein length by 3? Shouldn't it just be readRpk = 1 / ($proteinLength / 1000)?

Also, will there be an update that builds a database on the entire UniRef90 for mapping and quantification, rather than just the IDs that have a KO? I would like to use it for gene enrichment analysis using GO.

Thanks,
Nabil
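
One possible reading of the factor of 3, offered as an assumption rather than the author's confirmed reasoning: the reference length is in amino acids while reads are nucleotide sequences, and each amino acid corresponds to three nucleotides, so multiplying by 3 expresses the length in bases before dividing by 1000. The sketch below compares the two formulas from the question; both function names are hypothetical and do not come from FMAP.

```shell
#!/bin/sh
# Hypothetical helpers (not part of FMAP) comparing the two formulas quoted
# in the question. The argument is a protein length in amino acids.

# Formula as quoted from the code: length converted to nucleotides (x3),
# then to kilobases.
read_rpk() {
    awk -v len="$1" 'BEGIN { printf "%.4f\n", 1 / (len * 3 / 1000) }'
}

# Formula proposed in the question: length left in amino acids.
read_rpk_no_factor() {
    awk -v len="$1" 'BEGIN { printf "%.4f\n", 1 / (len / 1000) }'
}

# A 300 aa protein spans 900 nt.
read_rpk 300             # prints 1.1111
read_rpk_no_factor 300   # prints 3.3333
```

The two variants differ by a constant factor of 3 for every protein, so relative abundances between orthologies would be unchanged; only the absolute scale depends on which unit the length is expressed in.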

Cannot Prepare Databases

Running perl $FMAP_DIR/FMAP_prepare.pl -k results in: ERROR in FMAP_prepare.pl: Use of uninitialized value $line in chomp at FMAP_prepare.pl line 186.

ERROR in FMAP_quantification.pl

I completed the first step and got the output file (file1).
But when I run perl FMAP_quantification.pl [file1] > [file2], it shows ERROR in FMAP_quantification.pl: Use of uninitialized value $orthology in hash element at FMAP_quantification.pl line 71, <$reader> line 1773. What should I do to fix it?

Running FMAP_all.pl on Linux - "File is not available"

I tried to run the FMAP_all.pl script on multiple datasets (the example dataset among them) and got the following error message every time at the mapping step:

Error: Error calling stat on file [filepath]
" is not available.a/FMAP/FMAP_mapping.pl: The input "[file]

After checking the .sh file created by the program, the problem seems to be that there is an "^M" ending after each filename.
Converting the .config and .sh files with the dos2unix tool solved the problem for me.
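
Where dos2unix is not installed, the same cleanup can be sketched with standard tools. This is a minimal illustration, not an FMAP feature: the function name `strip_cr` is hypothetical, and the demonstration uses a temporary file instead of real .config or .sh files.

```shell
#!/bin/sh
# Hypothetical helper: write the contents of a file to standard output with
# every carriage return character removed (the "^M" seen in the .sh file).
strip_cr() {
    tr -d '\r' < "$1"
}

# Self-contained demonstration on a temporary file with Windows line endings.
demo=$(mktemp)
printf 'sample1.fastq\r\nsample2.fastq\r\n' > "$demo"

# Write the cleaned copy next to the original, then replace the original.
strip_cr "$demo" > "$demo.unix"
mv "$demo.unix" "$demo"

# Show the cleaned content, then remove the temporary file.
cat "$demo"
rm -f "$demo"
```

Writing to a separate file and then moving it into place avoids truncating the input while it is still being read.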

FMAP_comparison.pl error - what is right abundance.txt format?

Hi @jiwoongbio

I'm trying to use FMAP only for the final steps of comparison and statistical testing, so I need to create the right input format for FMAP_comparison.pl.

My table head looks like this:

But when I try to run the command:
perl FMAP_comparison.pl abundance.txt SI3U_2017,SI3L_2017,SB_12_2017,SB_02_2017,SB_02_2018 CB4_2018,CBIA_2018,CB1_2018,CBIW_2018,CBIW_02_2017,CBIW_12_2017 > orthology_test_stat.txt

It returns:
ERROR in FMAP_comparison.pl: The sample "CBIW_12_2017" is not on the table.

I'm guessing there is something wrong with my format, but it looks exactly like the example files. Thanks for the help!
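
One possibility, not confirmed in this thread, is that the header line carries invisible characters (for example a trailing carriage return or space) so that a column name no longer matches the name given on the command line. The sketch below makes such characters visible; it assumes a tab-separated table whose first line holds the sample names, and the function name `show_header` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each tab-separated column name from the first
# line of a table on its own line. The sed "l" command marks the end of
# every line with "$" and writes a carriage return as the escape "\r".
show_header() {
    head -n 1 "$1" | tr '\t' '\n' | sed -n l
}

# Self-contained demonstration: a small table saved with Windows line endings.
demo=$(mktemp)
printf 'orthology\tSB_02_2018\tCBIW_12_2017\r\nK00001\t1\t2\r\n' > "$demo"
show_header "$demo"
rm -f "$demo"
```

A column that prints as `CBIW_12_2017\r$` rather than `CBIW_12_2017$` would differ from the name typed on the command line even though the two look identical in a spreadsheet or text editor.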

Download error

Tried to run 'perl FMAP_download.pl' and got the following error:

ERROR: cannot verify qbrc.swmed.edu's certificate, issued by '/C=US/ST=MI/L=Ann Arbor/O=Internet2/OU=InCommon/CN=InCommon RSA Server CA':
Issued certificate has expired.
To connect to qbrc.swmed.edu insecurely, use `--no-check-certificate'.
ERROR in FMAP_download.pl: Use of uninitialized value $line in chomp at FMAP_download.pl line 127.

Seems like a certificate needs to be updated.

Thanks

Conda Install

Any chance of making a conda package for FMAP? So many of us use this installation approach that it would be a real help.

Getting direction (log2foldchange) of significantly different pathways and operons

Hello!

I was wondering if there is a way to obtain the direction (log2foldchange value) of the significantly different enriched pathways that are revealed by the FMAP_pathway.pl command. Is there a way to do this using the FMAP_module.pl command too?

Thanks!

Dealing with Paired End Files

Is there any method in place to run the FMAP_mapping.pl step on paired-end FASTQ files? Is giving both files as parameters to the Perl script sufficient, or is there another way to go about this?
