patRoon

DOI: 10.1186/s13321-020-00477-w

patRoon aims to provide comprehensive mass spectrometry based non-target analysis (NTA) workflows for environmental analysis. The name is derived from a Dutch word that means pattern and may also be an acronym for hyPhenated mAss specTROmetry nOn-target aNalysis.

Project news

May 2022: patRoon 2.1 is now available. This release integrates prediction of transformation products with CTS, adds several feature intensity normalization methods, adds new functionality and improvements for reporting TP data, and supports loading, processing and annotation with MS libraries such as MassBank. Please see the Project NEWS for details.

Introduction

Mass spectrometry based non-target analysis is used to screen large numbers of chemicals simultaneously. For this purpose, high resolution mass spectrometry instruments are used, typically coupled (or hyphenated) with chromatography (e.g. LC or GC). The size and complexity of the resulting data make manual processing impractical. Many software tools have been developed to facilitate a more automated approach. However, these tools are generally not optimized for environmental workflows and/or only implement parts of the functionality required.

patRoon combines established software tools with novel functionality in order to provide comprehensive NTA workflows. The different algorithms are provided through a consistent interface, which removes the need to know all the details of each individual software tool and to perform tedious data conversions during the workflow. The table below outlines the major functionality of patRoon.

| Functionality | Description | Algorithms |
| --- | --- | --- |
| Raw data pre-treatment | MS format conversion (e.g. vendor to mzML) and calibration. | ProteoWizard, OpenMS, DataAnalysis |
| Feature extraction | Finding features and grouping them across analyses. | XCMS, OpenMS, enviPick, DataAnalysis, KPIC2, SIRIUS, SAFD |
| Suspect screening | Finding features with suspected presence by MS and chromatographic data. Estimation of identification confidence levels. | Native |
| MS data extraction | Automatic extraction and averaging of feature MS(/MS) peak lists. | Native, mzR, DataAnalysis |
| Formula annotation | Automatic calculation of formula candidates for features. | GenForm, SIRIUS, DataAnalysis |
| Compound annotation | Automatic (in silico) compound annotation of features. | MetFrag, SIRIUS, Native |
| Componentization & adduct annotation | Grouping of related features based on chemistry (e.g. isotopes, adducts and homologs), hierarchical clustering or MS/MS similarity into components. Using adduct and isotope annotations for prioritizing features and improving formula/compound annotations. | RAMClustR, CAMERA, nontarget R package, OpenMS, cliqueMS, Native |
| Combining algorithms | Combine data from different algorithms (e.g. features, annotations) and generate a consensus. | Native |
| Sets workflows | Simultaneous processing and combining of +/- MS ionization data. | Native |
| Transformation product (TP) screening | Automatic screening of TPs using library/in-silico data, MS similarities and classifications. Tools to improve compound TP annotation. | BioTransformer, PubChemLite, Native |
| Reporting | Automatic reporting in CSV, PDF and (interactive) HTML formats. An example HTML report can be viewed here. | Native |
| Data clean-up & prioritization | Filters for blanks, replicates, intensity thresholds, neutral losses, annotation scores, identification levels and many more. | Native |
| Data curation | Several graphical interactive tools and functions to inspect and remove unwanted data. | Native |

The workflow of non-target analysis typically depends on the aims and requirements of the study and on the instrumentation and methodology used for sample analysis. For this reason, patRoon does not enforce a certain workflow. Instead, most workflow steps are optional and fully configurable, and algorithms can easily be mixed or even combined.

Implementation details

  • patRoon is implemented as an R package, which allows easy interfacing with the many other R based MS tools and other data processing functionality from R.
  • Fully open-source (GPLv3).
  • Developed on Windows, Linux and macOS.
  • S4 classes and generics are used to implement a consistent interface to all supported algorithms.
  • Continuous integration is used for automated unit testing, automatically updating the Website and documentation and maintaining a miniCRAN repository and Docker image to simplify installation (see the handbook for more details).
  • Supports all major instrument vendor input formats (through usage of ProteoWizard and DataAnalysis).
  • Optimizations
    • data.table is used internally as a generally much more efficient alternative to data.frame.
    • The processx and future R packages are used for parallelization.
    • Results from workflow steps are cached within a SQLite database to avoid repeated computations.
    • Code for loading MS and EIC data, MS similarity calculations and other tasks was implemented in C++ to reduce computational times.
  • The RDCOMClient R package is used to interface with Bruker DataAnalysis algorithms.
  • The Shiny R package was used to implement several GUI tools.
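
The S4-based design mentioned in the list above can be illustrated with a minimal, self-contained sketch. This is not patRoon code: the class toyFeatures, the generic toyFindFeatures() and both toy "algorithms" are hypothetical names invented here to show how a single generic can expose several back ends through one call signature and return one result class.

```r
library(methods)

# Hypothetical result class (not part of patRoon): every algorithm returns
# the same kind of object, so downstream code does not care which one ran.
setClass("toyFeatures",
         representation(algorithm = "character", table = "data.frame"))

# Single entry point; the back end is selected by name.
setGeneric("toyFindFeatures",
           function(intensities, algorithm, ...) standardGeneric("toyFindFeatures"))

setMethod("toyFindFeatures", signature("numeric", "character"),
          function(intensities, algorithm, ...) {
    # Two toy back ends standing in for real feature finders.
    finder <- switch(algorithm,
        threshold = function(x, minIntensity = 100) which(x >= minIntensity),
        localmax  = function(x) which(diff(sign(diff(x))) == -2) + 1,
        stop("Unknown algorithm: ", algorithm))
    idx <- finder(intensities, ...)
    new("toyFeatures", algorithm = algorithm,
        table = data.frame(index = idx, intensity = intensities[idx]))
})

# A method shared by all results, regardless of the algorithm used.
setMethod("length", "toyFeatures", function(x) nrow(x@table))

signal <- c(10, 150, 40, 300, 20, 90)
fThr <- toyFindFeatures(signal, "threshold", minIntensity = 100)
fMax <- toyFindFeatures(signal, "localmax")
```

Both calls return an object of the same class, so later steps (here only length()) work identically on either result. Algorithm-specific options such as minIntensity are simply passed through the generic's dots argument.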

Installation

patRoon itself can be installed like any other R package; however, some additional steps are needed to install its dependencies. Alternatively, RStudio based Docker images are available to easily deploy a complete patRoon environment. Please see the installation section in the handbook for more information.

Getting started

For a very quick start:

library(patRoon)
newProject()

The newProject() function will pop up a dialog (requires RStudio) that lets you quickly select the analyses and common workflow options, and subsequently generates a template R processing script.

However, for a better guide to get started it is recommended to read the tutorial. Afterwards the handbook is a recommended read if you want to know more about advanced usage of patRoon. Finally, the reference outlines all the details of the patRoon package.

Citing

When you use patRoon please cite its publications:

Rick Helmus, Thomas L. ter Laak, Annemarie P. van Wezel, Pim de Voogt and Emma L. Schymanski. patRoon: open source software platform for environmental mass spectrometry based non-target screening. Journal of Cheminformatics 13, 1 (2021)

Rick Helmus, Bas van de Velde, Andrea M. Brunner, Thomas L. ter Laak, Annemarie P. van Wezel and Emma L. Schymanski. patRoon 2.0: Improved non-target analysis workflows including automated transformation product screening. Journal of Open Source Software, 7(71), 4029

patRoon builds on many open-source software tools and open data sources. Therefore, it is important to also cite their work when using these algorithms via patRoon.

Contributing

For bug reports, code contributions (pull requests), questions, suggestions and general feedback please use the GitHub page.
