This project contains a Nextflow plugin called nf-provone.
The plugin provides a custom trace observer that produces provenance information when a Nextflow workflow is run.
Features, Use cases, and differences to nf-prov
nf-provone uses the ProvONE model, which extends the W3C PROV data model.
The plugin produces a provenance document in the JSON-LD format. The provenance document contains elements describing workflow processes, input/output data, and the user who ran the workflow. These elements are connected by relations, e.g. a relation connecting a process with its output data elements.
The information in the provenance document forms a graph that can be visualized and explored. The provenance data can also be converted to an RDF graph and stored in triple stores like Apache Jena Fuseki. SPARQL queries can then be used to ask provenance questions, like "How was file xyz created?" and "Show me all workflows I ran last week".
In contrast, nf-prov produces BioCompute Objects. BioCompute Object is the shorthand name for IEEE 2791-2020, the IEEE Standard for Bioinformatics Analyses Generated by High-Throughput Sequencing (HTS) to Facilitate Communication.
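As an illustration of the provenance questions mentioned above, the following sketch builds a SPARQL query for "How was file xyz created?" and sends it to a Fuseki endpoint. It is a sketch under assumptions, not the plugin's documented interface: the endpoint URL, the dataset name `provenance`, and the file IRI are hypothetical, and it assumes the stored graph uses the W3C PROV-O properties `prov:wasGeneratedBy` and `prov:used`.

```shell
#!/bin/sh
# Hypothetical endpoint: adjust host, port, and dataset name to your Fuseki setup.
FUSEKI_QUERY_URL="${FUSEKI_QUERY_URL:-http://localhost:3030/provenance/sparql}"

# Prints a SPARQL query asking which activity generated the entity with IRI $1
# and which inputs that activity used.
# Assumption: the provenance graph uses prov:wasGeneratedBy and prov:used.
build_query() {
  cat <<EOF
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT ?activity ?input WHERE {
  <$1> prov:wasGeneratedBy ?activity .
  OPTIONAL { ?activity prov:used ?input . }
}
EOF
}

# Sends the query as a URL-encoded POST and prints the JSON result set.
run_query() {
  curl --silent --fail \
    --header 'Accept: application/sparql-results+json' \
    --data-urlencode "query=$(build_query "$1")" \
    "$FUSEKI_QUERY_URL"
}

# Usage (the file IRI is a placeholder):
#   run_query "file:///data/xyz.txt"
```

Keeping the query text in its own function makes it easy to inspect or reuse the query without contacting the server.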
- ProvoneObserver: creates provenance for the workflow execution
- Visualization of provenance documents as an interactive knowledge graph. I'm currently testing Cytoscape.js for this purpose.
- Adding more useful provenance data, like the software environment (which container or conda environment was used?).
- Storing provenance documents in triple stores automatically.
To run the unit tests, run the following command in the project root directory (i.e. where the file settings.gradle is located):
./gradlew check
To build and test the plugin during development, configure a local Nextflow build with the following steps:
- Clone the Nextflow repository on your computer into a sibling directory:
  git clone --depth 1 https://github.com/nextflow-io/nextflow ../nextflow
- Configure the plugin build to use the local Nextflow code:
  echo "includeBuild('../nextflow')" >> settings.gradle
  (Make sure not to add it more than once!)
- Compile the plugin alongside the Nextflow code:
  make compile
- Run Nextflow with the plugin, using ./launch.sh as a drop-in replacement for the nextflow command, and adding the option -plugins nf-hello to load the plugin:
  ./launch.sh run nextflow-io/hello -plugins nf-hello
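The step above that appends the includeBuild line to settings.gradle is not safe to repeat. A small guard like the following sketch appends the line only when it is missing; the helper name `append_once` is made up for this example and is not part of the project.

```shell
#!/bin/sh
# Appends line $1 to file $2 only if that exact line is not already present.
append_once() {
  line="$1"
  file="$2"
  # -q: no output, -x: match whole lines only, -F: fixed string (no regex).
  # A missing file counts as "line not present", so the file is then created.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage, from the project root directory:
#   append_once "includeBuild('../nextflow')" settings.gradle
```

Running the usage line repeatedly leaves a single includeBuild entry in the file.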
The plugin can be tested without a local Nextflow build using the following steps:
- Build the plugin:
  make buildPlugins
- Copy build/plugins/<your-plugin> to $HOME/.nextflow/plugins
- Create a pipeline that uses your plugin and run it:
  nextflow run ./my-pipeline-script.nf
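The copy step above can be wrapped in a small helper, sketched below. The function name `install_plugin` is hypothetical, and the plugin directory is passed as an argument because its actual name under build/plugins depends on your build.

```shell
#!/bin/sh
# Copies a built plugin directory into a Nextflow plugins directory.
# $1: path to the built plugin directory (a directory under build/plugins)
# $2: destination directory (defaults to $HOME/.nextflow/plugins)
install_plugin() {
  src="${1%/}"   # drop a trailing slash so the directory itself is copied
  dest="${2:-$HOME/.nextflow/plugins}"
  if [ ! -d "$src" ]; then
    echo "not a directory: $src" >&2
    return 1
  fi
  mkdir -p "$dest"
  cp -R "$src" "$dest/"
}

# Usage (replace <your-plugin> with the directory produced by the build):
#   install_plugin "build/plugins/<your-plugin>"
```

The optional second argument allows a trial run against a scratch directory before touching $HOME/.nextflow/plugins.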