pipeline-resources's Issues

Definition of how to meet and verify standards

The pipeline standards descriptions are fantastic, but they could be improved by stating explicitly how each standard can be met and how it can be verified that the standard has in fact been met.

I propose that each standard include two additional statements:
"To meet this standard..."
"To verify this standard..."

This will clear up ambiguities for both developers and reviewers. The goal is to define what information must be available to reviewers so that they can verify that a standard has been met.

Observations about Proposed Standards for Public Health Bioinformatics Software

My group and I have been reviewing the Proposed Standards for Public Health Bioinformatics Software document from the perspective of a team dedicated to developing analysis pipelines. Based on our experience, we have some observations about the document that we hope will help in its development.

We believe it needs to be clearly defined whether these are minimum requirements, best practices, or guidelines, which we think is already under discussion in the meetings. We also think it should be clarified whether these are standards for pipelines or for software, as some points may not apply to pipelines, and vice versa.

We also believe that, in addition to indicating how each point will be evaluated, which Frank is working on in his PR (#32), it could be useful to add a section per point with resources, such as links or documents, that can assist developers with each of the standards.

Next, I will describe our observations on some of the points:

  • Version Control: Links like https://semver.org/spec/v2.0.0.html and https://keepachangelog.com/en/1.0.0/ could be added regarding the CHANGELOG.
  • Commitment to Maintain: Perhaps it could be replaced with a section like "Maintenance Capability," since even if a commitment to maintain exists, it may not be fulfilled due to external factors or bad faith. A description or demonstration of how the pipeline will be maintained may be sufficient, stated in the README and taking into account the pipeline's background (research project, master's project, community...).
  • Documentation for Local Installation and/or Remote Access (e.g. Web Server or Galaxy/Terra Workflow): Some recommendations like pip, conda, etc., could be included for installation.
  • Software Performance: We do not believe that documenting this should be a minimum requirement for a pipeline. It is also relatively difficult for a small group of developers working alone, and it seems to depend more on feedback from other groups or users.
  • Common File Formats: Will this be reviewed against a list of formats? A list would need to be provided in the standard's definition but also kept up to date over time, which seems complicated but also relevant.
  • Software Security and Vulnerabilities: We do not see this as necessary for a pipeline. Perhaps for a website or database, but even then, in many cases, it is carried out by the security department of the institution externally to the code itself.
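As one illustration of how the Version Control point could be verified, a reviewer could check that release tags follow Semantic Versioning 2.0.0 (the specification linked above). The following is a minimal sketch, not part of the proposed standards: the function name `is_semver` is hypothetical, and the pattern is a simplified one that does not enforce every rule of the specification.

```python
import re

# Simplified pattern for MAJOR.MINOR.PATCH with optional pre-release and
# build metadata. Assumption: this is an approximation of the specification;
# for example, leading zeros in numeric pre-release identifiers are not rejected.
SEMVER_PATTERN = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
    r"(?:\+([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
)


def is_semver(version: str) -> bool:
    """Return True if the whole string looks like a SemVer 2.0.0 version."""
    return SEMVER_PATTERN.fullmatch(version) is not None


if __name__ == "__main__":
    for tag in ("1.2.3", "1.0.0-rc.1+build.5", "v1.2.3", "1.2"):
        print(tag, is_semver(tag))
```

A check like this only covers the format of the version string; whether the CHANGELOG follows the Keep a Changelog conventions would still need to be reviewed by a person.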

Here is a proposed reorganization to reduce the list to 10, which I believe was one of the next objectives:

  1. Publicly-Accessible Repository
  2. Version Control
  3. Pipeline Documentation
    • Open-Source License
    • Contribution, Authorship, and Verified Point of Contact
    • Maintenance Capability
    • Conflict of Interest Statement
  4. Pipeline Guidelines
    • Documentation for Local Installation and/or Remote Access
    • Software Functionality
    • Statement of Need with Respect to Public Health Pathogen Genomics
    • Example Usage
    • Container/Packaged Software
  5. Software Testing
  6. Community Guidelines for Contribution and Support
  7. Benchmark/Validation Datasets
  8. Common File Formats
  9. Reference Data Requirements

HIV Guidance Document - Workflows for HIV analysis

Many tools exist for HIV analysis. Recently, the advent of workflow managers has led to a large number of analysis workflows composed of numerous tools. A description of these workflows and their internal components would be helpful for newer HIV bioinformatics scientists.

Include MicrobeTrace as a bioinformatics solution for Mpox

Please include MicrobeTrace (https://microbetrace.cdc.gov/MicrobeTrace/) as a bioinformatics solution for Mpox virus (MPXV) genomic analysis. Our MicrobeTrace team at CDC recommends that users load a distance matrix or distance-based edge list in CSV format. MicrobeTrace performs better with edge lists than with large genome sequence alignments in FASTA format. We're also implementing code to generate distance-based edge lists from the Nextstrain tree output.

Contact our team for a consultation or technical support at [email protected].
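For readers who already have a pairwise distance matrix, the conversion to a distance-based edge list in CSV format can be sketched as follows. This is only an illustration under stated assumptions: the function names, the optional `threshold` parameter, and the column names `source`, `target`, and `distance` are hypothetical and are not taken from the MicrobeTrace documentation, so check what MicrobeTrace expects before loading a file produced this way.

```python
import csv
import io
from itertools import combinations


def matrix_to_edge_list(labels, matrix, threshold=None):
    """Yield (source, target, distance) for each unordered pair of samples.

    `matrix` is a square, symmetric list of lists indexed like `labels`.
    Pairs with a distance above `threshold` are skipped when it is given.
    """
    for i, j in combinations(range(len(labels)), 2):
        distance = matrix[i][j]
        if threshold is None or distance <= threshold:
            yield labels[i], labels[j], distance


def write_edge_list_csv(edges, handle):
    """Write the edges as CSV with a header row (assumed column names)."""
    writer = csv.writer(handle)
    writer.writerow(["source", "target", "distance"])
    writer.writerows(edges)


if __name__ == "__main__":
    # Toy data, not real MPXV distances.
    labels = ["A", "B", "C"]
    matrix = [[0, 2, 5], [2, 0, 3], [5, 3, 0]]
    buffer = io.StringIO()
    write_edge_list_csv(matrix_to_edge_list(labels, matrix, threshold=3), buffer)
    print(buffer.getvalue())
```

Each pair is written once, so the file grows with the number of pairs kept rather than with alignment length, which is consistent with the recommendation above to prefer edge lists over large alignments.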

HIV Guidance Doc

📌 Explain the Request

There exists a need to encourage burgeoning bioinformaticians to pursue the challenges presented by HIV genomics. Here, we present a guidance document designed to help public health scientists interested in performing genomic characterization, antiviral resistance detection, phylogenetic tree construction, and genomic epidemiological investigation involving HIV Next Generation Sequencing data.

📚 Context

There exists a need to encourage burgeoning bioinformaticians to pursue the challenges presented by HIV genomics.

📈 Desired Contribution

We request a guidance document designed to help public health scientists interested in performing genomic characterization, antiviral resistance detection, phylogenetic tree construction, and genomic epidemiological investigation involving HIV Next Generation Sequencing data.

ℹ️ Additional Information

Please include descriptions of common open-source bioinformatics software as well as case study examples.

TB Reference Compilation

In the TB Guidance document, references are currently listed in comments. We would like to compile them into a ref section and add ref indicators in the body of the doc.

Discuss masking and handling of low quality data

Primer erosion with Omicron is showing itself to be fairly important. In some cases, masking problematic amplicons may help improve the interpretability of the data. Is it worth bringing that out here? It is hard to do without a fairly detailed tour through the various amplicon strategies and how they're faring.

Influenza Guidance Document - Request for feedback

📌 Explain the Request
Reformat the doc where it is needed. The doc is a living doc so don't hesitate to propose changes.

Additional Information
If you think that some sections need review and editing please do so.

Issue and Pull Request Submission Guide

Add a guide to submit issues and pull requests as part of the document maintenance framework.

Issues can be submitted in order to request changes or additions to the documents and should be tagged with the documents they are referring to.
Pull Requests will allow for change tracking and collaborative review of any changes implemented into any document. The PRs should also be tagged with the issue they are "fixing" and they should reference the document under modification in the PR title.

This guide should be added to the README of the pipeline-solutions main page.

Issue and PR templates would be a nice addition.
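The two conventions described above for pull requests (name the document in the title, reference the issue being fixed) could also be checked automatically. The following is a minimal sketch only: the function name `check_pull_request` and the list of document names are hypothetical placeholders, and the pattern matches GitHub's closing keywords (close, fix, resolve and their variants) followed by an issue number.

```python
import re

# Illustrative document names (assumption); replace with the documents
# actually tracked in the repository.
KNOWN_DOCUMENTS = (
    "HIV Guidance Document",
    "TB Guidance Document",
    "Influenza Guidance Document",
)

# A closing keyword followed by an issue number, e.g. "Fixes #12".
ISSUE_LINK = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)


def check_pull_request(title: str, body: str) -> list[str]:
    """Return a list of problems found; an empty list means both checks pass."""
    problems = []
    if not any(doc.lower() in title.lower() for doc in KNOWN_DOCUMENTS):
        problems.append("title does not name the document under modification")
    if not ISSUE_LINK.search(body):
        problems.append("body does not reference the issue being fixed")
    return problems


if __name__ == "__main__":
    print(check_pull_request("HIV Guidance Document: add table of contents", "Fixes #12"))
    print(check_pull_request("Update docs", "Small changes."))
```

A check of this kind would complement, not replace, the issue and PR templates suggested above.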

HIV Guidance Document - Add references for Bioinformatics Tools

In the HIV guidance document, a number of bioinformatics tools are described in terms of their application to HIV genome analysis. Each of these tools should have a reference, whether it be a publication or a GitHub page.

In this issue we request that each tool be given a reference with link.

HIV Guidance Doc - Markdown formatting of Case Studies Section

📌 Explain the Request

Reformat the case studies section of the HIV guidance document to be more reader-friendly. Add section breaks and titles to each case study.

📚 Context

Currently the case studies appear to all be part of the same section and they are not easily distinguishable by the reader.

📈 Desired Contribution

A fork of the main PHA4GE repo including Markdown changes that reformat this section into a more readable version.

ℹ️ Additional Information

No new sections need to be added, just reformatting.

HIV Guidance Document - Add a Table of Contents

📌 Explain the Request

We would like a table of contents added to the HIV Guidance document with links to all sections.

📚 Context

Currently the document is a bit difficult to navigate for a newer GitHub user, so a table of contents would enhance accessibility.

📈 Desired Contribution

A fork of the main PHA4GE repo including Markdown code adding a table of contents to the HIV Guidance Document.

ℹ️ Additional Information

https://github.blog/changelog/2021-04-13-table-of-contents-support-in-markdown-files/
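If the table of contents is to be written into the Markdown file itself, it could be generated from the document's headings rather than maintained by hand. The sketch below is one possible approach, not an existing PHA4GE script: the function names are hypothetical, and `slugify` only approximates how GitHub derives heading anchors (for example, it does not add the numeric suffixes GitHub uses for duplicate headings).

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def slugify(title: str) -> str:
    """Approximate a GitHub-style heading anchor (assumption, see lead-in)."""
    slug = title.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # drop punctuation
    return re.sub(r"\s", "-", slug)       # each whitespace character becomes a hyphen


def build_toc(markdown: str) -> str:
    """Return a nested Markdown list linking to every ATX heading."""
    entries = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # skip '#' comment lines inside code blocks
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            indent = "  " * (level - 1)
            entries.append(f"{indent}- [{title}](#{slugify(title)})")
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical headings used only to demonstrate the output format.
    sample = "# HIV Guidance\n\n## Case Studies\n\n### Case 1: Drug Resistance\n"
    print(build_toc(sample))
```

The generated links should be checked in the rendered document, since anchor rules can differ between Markdown renderers.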
