
Comments (4)

cfrioux commented on May 27, 2024

Hi Adifo,

Errors with Pathway Tools occur quite often. We do our best to prevent them or to help troubleshoot them, but it is not always easy. The error could also come from our side, through the mpwt dependency.

Could you please share any logs that you might have? Cheers

from metage2metabo.

Adifo commented on May 27, 2024

Hi Clémence,

I've overwritten the logs leading to the File Error 17, but I will try to recreate the circumstances.
Now that I've figured out most of the issues with the prokka output and the new gbff format, I ran into fewer types of error and managed to obtain all my SBML files in two successive runs.

First, here is the log of the first run, which led to the API limitation error. The input directory gbk contains 404 directories, each with its corresponding .gbk/.gbff file.

Command:
m2m workflow -g gbk -s seeds_workflow.sbml -o reconTestOutput -c 10 --clean

m2m_workflow.log:
start_workflow.log

Directories count:

echo -ne "Input\t"; find ./gbk/ -name 'pathologic.log' | wc -l; echo -ne "Local\t"; find ~/ptools-local/pgdbs/ -name 'gc*' -type d | wc -l; echo -ne "Output\t"; find reconTestOutput/pgdb/ -name 'GC*' -type d | wc -l
Input	404
Local	404
Output	397

Second, a rerun to complete the ones that failed.

Command:
m2m workflow -g gbk -s seeds_workflow.sbml -o reconTestOutput -c 10

m2m_workflow.log:
second.log

Directories count:

echo -ne "Input\t"; find ./gbk/ -name 'pathologic.log' | wc -l; echo -ne "Local\t"; find ~/ptools-local/pgdbs/ -name 'gc*' -type d | wc -l; echo -ne "Output\t"; find reconTestOutput/pgdb/ -name 'GC*' -type d | wc -l
Input	379
Local	404
Output	404

In this case, I managed to finish the workflow, so I'm fine with the result. Having fixed the content of the gbk files beforehand probably helped (for example Locus line format errors, missing db_xref entries or erroneous taxon_id values). However, I lost some PathoLogic logs in my input directory, and mpwt still displayed several errors while trying to rerun on genomes that were tagged as already present.

From what I remember of the m2m code (reconstruction.py), the full list of input genomes is compared to the PGDB content. The mpwt process is skipped only if the two sets match fully. If the PGDB set is incomplete, the whole input directory goes to mpwt without removing the genomes already processed. Maybe adding a skip list to the multiprocess_pwt function could help.
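
To make the skip-list idea concrete, here is a minimal sketch of the set comparison it would rely on. It is not the m2m or mpwt code: the helper name `pending_genomes` is hypothetical, and it assumes that each genome has its own folder in the input directory and that a finished PGDB folder carries the same name, compared case-insensitively because the counts above use both 'gc*' and 'GC*'.

```python
from pathlib import Path
from typing import List


def pending_genomes(input_dir: Path, pgdb_dir: Path) -> List[str]:
    """Return the input genome folders that have no matching PGDB folder yet.

    Hypothetical helper: folder names are assumed to match between the
    input directory and the PGDB directory, ignoring case.
    """
    done = set()
    if pgdb_dir.is_dir():
        # Names of the PGDB folders that already exist.
        done = {p.name.lower() for p in pgdb_dir.iterdir() if p.is_dir()}
    # Keep only the input folders without a PGDB counterpart.
    return sorted(
        p.name
        for p in input_dir.iterdir()
        if p.is_dir() and p.name.lower() not in done
    )
```

Calling `pending_genomes(Path("gbk"), Path("reconTestOutput/pgdb"))` after the first run would list the genomes still to be processed, and everything else could go on the skip list.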


ArnaudBelcour commented on May 27, 2024

Hi @Adifo,

As you have already found a solution, I will just comment on what the error was.

During the inference by Pathway Tools, there is a step that loads NCBI citations. With multiprocessing, Pathway Tools sometimes sends too many queries to the NCBI server (we have X Pathway Tools processes that can query the NCBI, X being the number of CPUs given with the option -c).

This can lead to the following error:

    Fatal error: XML not well-formed - unrecognized content '{"error":"API rate limit exceeded","api-'

There is an option to avoid loading the NCBI citations, named ###Batch-PathoLogic-Download-Pubmed-Entries? (in the ptools-init.dat file located inside the ptools-local folder), but I never got it to work.
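
To see how that option is currently written in a given installation, a small read-only sketch such as the one below can help. It only reports the lines that mention the option and does not change the file. The helper name `find_option_lines` is hypothetical, and treating a leading '#' as "commented out" is an assumption about the ptools-init.dat format, not something confirmed here.

```python
from pathlib import Path
from typing import List, Tuple

# Option name as quoted above, without the leading '#' characters.
OPTION = "Batch-PathoLogic-Download-Pubmed-Entries?"


def find_option_lines(init_file: Path, option: str = OPTION) -> List[Tuple[int, bool, str]]:
    """Return (line number, starts_with_hash, text) for each line naming the option."""
    hits = []
    with open(init_file, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if option in line:
                text = line.strip()
                # Assumption: a leading '#' means the line is commented out.
                hits.append((number, text.startswith("#"), text))
    return hits
```

The path to pass in depends on where the ptools-local folder lives on the machine.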

For your second issue, Metage2Metabo indeed only checks whether the output folder contains the PGDBs of all the associated organisms. All other cases are handled by mpwt. But as I modified the behavior of mpwt in version 0.7.0, it is possible that there are some issues.

I will try to investigate this when I have some time. Just to be sure, can you send me your version of mpwt so that I can check whether it is greater than or equal to 0.7.0?
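
The installed version can be read with `pip show mpwt` or `pip list`. As a minimal sketch of the comparison itself, the hypothetical helpers below compare dotted version strings numerically. They are only an illustration and handle plain numeric versions; the `packaging` library is the more robust choice for anything else.

```python
import re
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '0.7.1' into (0, 7, 1), stopping at the first non-numeric component."""
    parts = []
    for component in version.split("."):
        match = re.match(r"\d+", component)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed: str, minimum: str) -> bool:
    """Return True if the installed version is greater than or equal to the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


print(is_at_least("0.7.1", "0.7.0"))  # True
```

Comparing tuples of integers rather than raw strings avoids ranking a version such as 0.10.0 below 0.7.0.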


Adifo commented on May 27, 2024

Thanks for the insight on the API error. I will keep that in mind when defining the number of threads next time.

Regarding mpwt, I used version 0.7.1.

Below is the full list of software in my environment, followed by the packages installed with pip.

From my perspective, you can close this issue but feel free to leave it open and reach out to me for feedback.

micromamba list
List of packages in environment: "/home/adf/micromamba/envs/m2m"

  Name              Version    Build               Channel    
────────────────────────────────────────────────────────────────
  _libgcc_mutex     0.1        conda_forge         conda-forge
  _openmp_mutex     4.5        1_gnu               conda-forge
  ca-certificates   2021.10.8  ha878542_0          conda-forge
  ld_impl_linux-64  2.36.1     hea4e1c9_2          conda-forge
  libffi            3.4.2      h7f98852_5          conda-forge
  libgcc-ng         11.2.0     h1d223b6_14         conda-forge
  libgomp           11.2.0     h1d223b6_14         conda-forge
  libnsl            2.0.0      h7f98852_0          conda-forge
  libstdcxx-ng      11.2.0     he4da1e4_14         conda-forge
  libzlib           1.2.11     h166bdaf_1014       conda-forge
  ncurses           6.3        h9c3ff4c_0          conda-forge
  openssl           1.1.1n     h166bdaf_0          conda-forge
  pip               21.3.1     pyhd8ed1ab_0        conda-forge
  python            3.6.15     hb7a2778_0_cpython  conda-forge
  python_abi        3.6        2_cp36m             conda-forge
  readline          8.1        h46c0cb4_0          conda-forge
  setuptools        58.0.4     py36h5fab9bb_2      conda-forge
  sqlite            3.37.1     h4ff8645_0          conda-forge
  tk                8.6.12     h27826a3_0          conda-forge
  wheel             0.37.1     pyhd8ed1ab_0        conda-forge
  xz                5.2.5      h516909a_1          conda-forge
  zlib              1.2.11     h166bdaf_1014       conda-forge

pip list
Package             Version
------------------- ---------
anyio               3.5.0
appdirs             1.4.4
argcomplete         2.0.0
argh                0.26.2
Arpeggio            2.0.0
async-generator     1.10
attrs               21.4.0
biopython           1.79
bubbletools         0.6.11
certifi             2021.10.8
cffi                1.15.0
chardet             4.0.0
charset-normalizer  2.0.12
clingo              5.5.1
clyngor             0.4.2
clyngor-with-clingo 5.3.post1
cobra               0.24.0
commonmark          0.9.1
contextvars         2.4
dataclasses         0.8
decorator           4.4.2
depinfo             1.7.0
diskcache           5.4.0
docopt              0.6.2
ete3                3.1.1
future              0.18.2
gffutils            0.10.1
graphviz            0.19.1
h11                 0.12.0
httpcore            0.14.7
httpx               0.22.0
idna                3.3
immutables          0.17
importlib-metadata  4.8.3
importlib-resources 5.4.0
iniconfig           1.1.1
lxml                4.8.0
MeneTools           3.2.1
Metage2Metabo       1.5.0
Miscoto             3.1.2
mpmath              1.2.1
mpwt                0.7.1
networkx            2.5.1
numpy               1.19.5
optlang             1.5.2
packaging           21.3
padmet              5.0.1
pandas              1.1.5
phasme              0.0.16
pip                 21.3.1
pluggy              1.0.0
powergrasp          0.8.18
py                  1.11.0
pycparser           2.21
pydantic            1.9.0
pydot               1.4.2
pyfaidx             0.6.4
Pygments            2.11.2
pyparsing           3.0.7
pyPEG2              2.15.2
pytest              7.0.1
python-dateutil     2.8.2
python-libsbml      5.19.2
pytz                2022.1
rfc3986             1.5.0
rich                12.2.0
ruamel.yaml         0.17.21
ruamel.yaml.clib    0.2.6
setuptools          58.0.4
simplejson          3.17.6
six                 1.16.0
sniffio             1.2.0
swiglpk             5.0.5
sympy               1.9
tomli               1.2.3
typing_extensions   4.1.1
wheel               0.37.1
zipp                3.6.0
