nf-core / metaboigniter

Pre-processing of mass spectrometry-based metabolomics data with quantification and identification based on MS1 and MS2 data.

Home Page: https://nf-co.re/metaboigniter

License: MIT License

HTML 0.97% Python 29.01% Nextflow 66.23% Shell 3.78%

workflow metabolomics identification quantification mass-spectrometry nextflow pipeline nf-core ms1 ms2

metaboigniter's Issues

Software in bioconda

Generally we aim to use software packaged in Bioconda for nf-core pipelines. By doing so we get support for conda, docker and singularity (the latter two via https://biocontainers.pro). We also avoid taking on the maintenance of software packaging on top of pipeline maintenance.

Currently, this pipeline is using a suite of custom Docker containers. These are all built using dedicated repos at https://github.com/MetaboIGNITER

If possible, it would be great to switch from using these to using Bioconda. Here's my quick googling for them:

metaboigniter/container-openms:v2.4.0 - Yes: openms / openms-thirdparty
- The nf-core/mhcquant pipeline seems to use the latter
metaboigniter/container-openmstoxcms:v1.0.1 - Maybe as part of the packages above?
metaboigniter/container-xcms:v3.0.2_ipo - Yes: bioconductor-xcms
metaboigniter/container-camera:v1.33.4_xcms3.0.2 - Yes: bioconductor-camera
metaboigniter/container-msnbase:v2.0.2 - Yes: bioconductor-msnbase
metaboigniter/container-metfrag:v0.0.1 - Yes: metfrag
metaboigniter/container-cfmid:v2.0.2 - No
metaboigniter/container-csifingerid:v4.5.1_v1 - Yes: sirius-csifingerid
metaboigniter/container-passatutto:v2.1.0 - No

So nearly all seem to be available already on the face of it.

As you're currently using one container per process, the quickest way to use them is just to add them to the main script, e.g.:

process xcms {
    container "quay.io/biocontainers/bioconductor-xcms:3.12.0--r40h5f743cb_0"
    conda "bioconda::bioconductor-xcms=3.12.0"

    script:
    """
    normal nextflow stuff here
    """
}

However, if it works, it might be nicer to add an environment.yml file back with the bioconda deps in, if they play well together. That gives a couple of advantages:

We can make the get_software_versions process run each command in one process to get the software version numbers reported
Simpler administration - nf-core lint checks this file for available updates for example
Smaller total file size for Singularity users

If they don't work together then that's fine. Pretty soon we will be moving all pipelines to DSL2 and rewriting pipelines to use a central repository of software wrappers at https://github.com/nf-core/modules - then each process will have to have its own container. If we're not using the main pipeline docker image at all we should delete the Dockerfile though and remove mention of the top-level process.container attribute.

Let me know what you think!

Phil

Do not define "NULL" string as a default value at nextflow_schema.json

Description

Some fields in nextflow_schema.json define their default value as a string ("default": "NULL"). This will be a problem in the upcoming version of tower.nf: the "NULL" string will be set in the launchpad form and sent to Nextflow when launching the pipeline. The run will then fail because Nextflow interprets it as a string and not as an empty parameter.

Solution

In line with the discussion here about enforcing stricter rules for initialising params with no default value, I suggest setting these fields to null in the nextflow.config file and removing the default setting from the schema file. This will be compatible with the future tower.nf release.
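
A minimal sketch of the suggested change, assuming a hypothetical optional parameter called `example_optional_param` (the affected parameter names are not listed in this issue). The parameter is initialised to `null` in nextflow.config, and the matching `"default": "NULL"` entry is dropped from nextflow_schema.json:

```groovy
// nextflow.config (sketch)
// `example_optional_param` is a hypothetical name used for illustration only.
params {
    // null means "not set"; the string "NULL" would be passed on as a real value
    example_optional_param = null
}
```

With this in place a check such as `if (params.example_optional_param)` behaves as expected, because `null` is falsy in Groovy whereas the non-empty string "NULL" is truthy.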

C13 detection will always be performed regardless of adduct detection option

C13 detection should be switched off if adduct detection is disabled, or it should have its own parameter.

https://github.com/nf-core/metaboigniter/blob/2f8f077d38fcacd2caef9590dc557ddcc17c78c6/subworkflows/local/annotation.nf#L30C1-L31C1

Pipeline has no release but no UNDER CONSTRUCTION warning

library creation fails due to missing parameter

These parameters are missing from the argument-reading section. They need to be added; otherwise the user has to use exactly the same headers as in the provided example.

metaboigniter/bin/createLibrary.r

Lines 12 to 15 in fca2ab5

 rawFileName<-"rawFile"
 compundID<-"HMDB.YMDB.ID"
 compoundName<-"PRIMARY_NAME"
 mzCol<-"mz"

Quantification clarification

We need to clarify that the missing values are represented as zero.

https://github.com/nf-core/metaboigniter/blob/2f8f077d38fcacd2caef9590dc557ddcc17c78c6/docs/output.md?plain=1#L25C4-L25C4

Process to create library using MSnbase fails on a few files

Description of the bug

In the identification subpipeline, I am trying to perform the identification using internal standards. I have a few library .mzML files and an associated library description file. The process process_create_library_pos_msnbase fails for some of the files but passes for others.

Steps to reproduce

Steps to reproduce the behaviour:

Sorry the data cannot be provided to reproduce the error :(

Command line: nextflow run metaboigniter/main.nf -c metaboigniter/conf/custom.config -profile singularity
System:

Hardware: HPC
Executor: slurm
OS: CentOS Linux
Version: 7

Nextflow Installation:

Version: 20.10.0

Container engine:

Engine: Singularity
version: 3.7.4-1.el7
Image tag: nfcore/metaboigniter:dev

Errors

Before the latest dev version, the process process_create_library_pos_msnbase failed on some of my library files, but the error was not the same for all these files, either this error (A) :

Error in strsplit(x = hitTMP[, "parentmzs"], split = ";", fixed = T)[[1]] : 
  subscript out of bounds
Calls: createLibrary
Execution halted

or this error (B) :

Error in parentMS2s[[p]] : subscript out of bounds
Calls: createLibrary
Execution halted

In the latest dev version (9c86f6f), with the modifications in the createLibrary.R file, the files which failed with error B (Error in parentMS2s[[p]] : subscript out of bounds) now pass this process, but the files which failed with error A (Error in strsplit(x = hitTMP[, "parentmzs"], split = ";", fixed = T)[[1]] : subscript out of bounds) still fail

If you have any idea on this issue it would be of great help 💪
Thanks in advance

Library search retention time tolerance is missing

"Path value cannot be null" error during featurelinkerunlabeledkd step

Description of the bug

The workflow runs peakpickerhires -> featurefindermetabo -> mapalignerposecluster -> maprttransformer and stops after the mapalignerposecluster step with the error "Error executing process Caused by: Path value cannot be null".

Previously I had to adjust the modules.config file to get the peakpickerhires step to work (added .centroided to the filename.mzML in line 48) and had to adjust the paths on lines 137 and 175 to remove ${meta.id} to fix a "filename too long" error.

Command used and terminal output

command: nextflow run nf-core/metaboigniter -profile docker

output: 
WARN: Input tuple does not match input set cardinality declared by process `NFCORE_METABOIGNITER:METABOIGNITER:LINKER:OPENMS_FEATURELINKERUNLABELEDKD` -- offending value: [id:Linked_data]
ERROR ~ Error executing process > 'NFCORE_METABOIGNITER:METABOIGNITER:LINKER:OPENMS_FEATURELINKERUNLABELEDKD (1)'

Caused by:
  Path value cannot be null

Relevant files

files.zip

System information

Nextflow version: 23.10.1
Metaboigniter version: 2.0.0
Hardware: Desktop
Executor: local
Container engine: Docker
OS: Linux (Fedora 39)

Use nf-validation for samplesheet parsing

Any plans to use nf-validation instead?

Originally posted by @maxulysse in #69 (comment)

Missing output file(s) error when centroiding data

Description of the bug

After centroiding first data file the workflow gives an error and stops, saying it can't find the centroided data file it just created.

It appears that changing line 48 in modules.config to

ext.prefix = { "${meta.id}.centroided" }

fixes the issue. The workflow appears to write the new centroided file as filename.mzML (the original name) instead of filename.centroided.mzML, which is what later steps look for.
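
The same override can be kept out of the pipeline's own modules.config by putting it in a custom config file. This is only a sketch: the process selector `OPENMS_PEAKPICKERHIRES` is an assumption based on the step name and is not confirmed in this issue.

```groovy
// custom.config (sketch); the selector name below is assumed, not taken from the pipeline source
process {
    withName: 'OPENMS_PEAKPICKERHIRES' {
        // Name the output <sample id>.centroided so that it differs from the input file name
        ext.prefix = { "${meta.id}.centroided" }
    }
}
```

It would then be passed on the command line, for example `nextflow run nf-core/metaboigniter -profile docker -c custom.config`.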

Command used and terminal output

No response

Relevant files

No response

System information

Nextflow version: 23.10.01
Metaboigniter version: 2.0.0
Hardware: Desktop
Executor: Local
Container Engine: Docker
OS: Linux (Fedora 39)

Filename too long error when aligning multiple files

Description of the bug

metaboigniter completes without error when running only a couple of files, but when running a full batch (63 files) it crashes at the alignment step with a "filename too long" error while trying to create the output of the mapalign step. It looks as though it is trying to pass an array of sample names as a filename in the /alignment/ folder.
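
One possible reading is that the publish path is built from `${meta.id}`, and that for this step `meta.id` holds the list of all sample names, so its string form becomes the directory name. A hedged sketch of a workaround that publishes to a fixed directory instead; the selector pattern and the use of `params.outdir` are assumptions, not taken from the pipeline source:

```groovy
// custom.config (sketch); selector pattern and parameter name are assumptions
process {
    withName: '.*MAPALIGNERPOSECLUSTERING.*' {
        publishDir = [
            // Fixed directory name instead of one derived from meta.id
            path: { "${params.outdir}/alignment" },
            mode: 'copy'
        ]
    }
}
```

Note that assigning `publishDir` this way replaces any publishing options the pipeline already defines for the matched process, so settings such as file-name filters would need to be repeated here.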

Command used and terminal output

command: nextflow run nf-core/metaboigniter -profile docker

output: Mar-01 19:13:26.604 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: /home/laytox/projects/smoke/smoke_taint/99-output/alignment/[c3r1-r001, c3r1-r002, c1r3-r003, c1r3-r001, c3r1-r003, c1r1-r002, c1r1-r003, c1r1-r001, c1r3-r002, c3r3-r001, c3r3-r002, c4r1-r001, c3r3-r003, c4r1-r002, s1r1-r001, c4r1-r003, c4r3-r002, c4r3-r003, c4r3-r001, s1r1-r002, s1r2-r001, s1r1-r003, s1r2-r002, s1r2-r003, s1r3-r001, s2r1-r001, s1r3-r002, s1r3-r003, s2r3-r001, s2r1-r002, s2r1-r003, s2r2-r001, s2r2-r003, s2r2-r002, s3r1-r003, s2r3-r002, s2r3-r003, s3r1-r001, s3r3-r001, s3r1-r002, s3r2-r001, s3r2-r002, s3r2-r003, s3r3-r002, s4br2-r001, s3r3-r003, s4br1-r001, s4br1-r002, s4br1-r003, s4br3-r003, s4br2-r002, s4br2-r003, s4br3-r001, s4br3-r002, s4fr2-r001, s4fr1-r001, s4fr3-r001, s4fr1-r002, s4fr1-r003, s4fr2-r002, s4fr2-r003, s4fr3-r002, s4fr3-r003]: File name too long

Relevant files

files.zip

System information

Nextflow version: 23.10.1
Metaboigniter version: 2.0.0
Hardware: desktop
Executor: local
container engine: docker
OS: Linux (Fedora 39)

Process to create library using MSnbase fails

Description of the bug

raw_file_name_preparelibrary_pos_msnbase
compund_id_preparelibrary_pos_msnbase
compound_name_preparelibrary_pos_msnbase
mz_col_preparelibrary_pos_msnbase

Steps to reproduce

Sorry the data cannot be given to reproduce the error :(

Command line: nextflow run metaboigniter/main.nf -c metaboigniter/conf/custom.config -profile singularity
Log file:
log.txt (.nextflow.log renamed in log.txt)
System:

Hardware: HPC
Executor: slurm
OS: CentOS Linux
Version: 7

Nextflow Installation:

Version: 20.10.0

Container engine:

Engine: Singularity
version: 3.7.4-1.el7
Image tag: nfcore/metaboigniter:1.0.1

Errors

I found that when we set the following four parameters to values different from the defaults:

raw_file_name_preparelibrary_pos_msnbase = 'RAW_FILE'
compund_id_preparelibrary_pos_msnbase = 'IARC_ID'
compound_name_preparelibrary_pos_msnbase = 'NAME'
mz_col_preparelibrary_pos_msnbase = 'MZ'

the process process_create_library_pos_msnbase fails with the error :

Loading required package: stringr
  Error in `[.data.frame`(libraryInfo, , requiredHeader["mzCol"]) : 
    undefined columns selected
  Calls: createLibrary -> IntervalMerge -> [ -> [.data.frame
  Execution halted

When we set the parameter for mz column to default ‘mz’ but the other three to values different than default :

raw_file_name_preparelibrary_pos_msnbase = 'RAW_FILE'
compund_id_preparelibrary_pos_msnbase = 'IARC_ID'
compound_name_preparelibrary_pos_msnbase = 'NAME'
mz_col_preparelibrary_pos_msnbase = 'mz'

the process process_create_library_pos_msnbase also fails but with a different error :

  Loading required package: stringr
  Error in data.frame(startRT = startRT, endRT = endRT, startMZ = startMZ,  : 
    arguments imply differing number of rows: 1, 0
  Calls: createLibrary -> IntervalMerge -> data.frame
  Execution halted

When we set all the four parameters to their default values :

raw_file_name_preparelibrary_pos_msnbase = 'rawFile'
compound_id_preparelibrary_pos_msnbase = 'HMDB.YMDB.ID'
compound_name_preparelibrary_pos_msnbase = 'PRIMARY_NAME'
mz_col_preparelibrary_pos_msnbase = 'mz'

the process succeeds for a few tasks (for a few identification files), but fails for others, giving the following error:

Loading required package: stringr
Error in strsplit(x = hitTMP[, "parentmzs"], split = ";", fixed = T)[[1]] : 
  subscript out of bounds
Calls: createLibrary
Execution halted

In the bin folder, I dug into the R scripts involved in process_create_library_pos_msnbase (createLibrary.R and createLibraryFun.R) and found that the failure is related to the data frame MSlibrary in createLibraryFun.R. For the failing tasks, the columns parentmzs, parentrts, parentInts and MS2s of MSlibrary are empty, so the line MSlibrary[MSlibrary[,"MS2s"]!="",] returns an empty data frame, which in turn produces an empty hitTMP data frame. For the succeeding tasks these columns are not empty, so hitTMP is not empty either.

I still can’t understand what could have happened leading to this issue 😢

If you have any idea that would be great !
Once again thank you so much in advance for your answer 💪

negative run error

Description of the bug

I have tried many times and always get the same error message when running negative data.

Command used and terminal output

Command error:
  Adding neutral: ---------- Adduct -----------------
  Charge: 0
  Amount: 1
  MassSingle: -18.0106
  Formula: H-2O-1
  log P: -2.99573
  
  Adding neutral: ---------- Adduct -----------------
  Charge: 0
  Amount: 1
  MassSingle: 46.0055
  Formula: C1H2O2
  log P: -0.693147
  
  MassExplainer table size: 4
  Error: Unexpected internal error (WARNING!!! implicit number of default adduct is negative!!! left:-1 right: -1
  )
  Generating Masses with threshold: -2.99573 ...
  done

Relevant files

No response

System information

No response

Add ThermoRawFileParser

Add ThermoRawFileParser so the file conversion is also part of the workflow.

Also, try to incorporate https://github.com/phnmnl/container-pwiz

METABOIGNITER: Migrate all docs to JSON parameter schema

Hi!

this is not necessarily an issue with the pipeline, but in order to streamline the documentation group next week for the hackathon, I'm opening issues in all repositories / pipeline repos that might need this update to switch from parameter docs to auto-generated documentation based on the JSON schema.

This will then supersede any further parameter documentation, thus making things a bit easier :-)

If this doesn't apply (anymore), please close the issue. Otherwise, I'm hoping to have some helping hands on this next week in the documentation team on Slack https://nfcore.slack.com/archives/C01QPMKBYNR

URGENT: pin nf-validation version

Description of the bug

To prevent breaking this pipeline in the near future, the nf-validation version should be pinned to version 1.1.3 like:

plugins {
    id 'nf-validation@1.1.3'
}

Command used and terminal output

No response

Relevant files

No response

System information

No response
