long-read-tools.org

A database of software tools for the analysis of long read sequencing data. To be included in the database, software must be available for download and public use somewhere (CRAN, Bioconductor, PyPI, Conda, GitHub, Bitbucket, a private website etc). To view the database, head to https://www.long-read-tools.org.

Purpose

This database is designed to be an overview of the currently available long read analysis software. It is unlikely to be 100% complete or accurate, but it will be updated as new software becomes available. If you notice a problem or would like to add something, please make a pull request or open an issue.

Citation/s

If you find the database useful, please consider citing our publication in your work:

Amarasinghe, S.L., Ritchie, M.E., & Gouil, Q. long-read-tools.org: an interactive catalogue of analysis methods for long-read sequencing data (2020). https://doi.org/10.1093/gigascience/giab003

A review of long read analysis tools:

Amarasinghe, S.L., Su, S., Dong, X. et al. Opportunities and challenges in long-read sequencing data analysis. Genome Biol 21, 30 (2020). https://doi.org/10.1186/s13059-020-1935-5

Structure

The main tools table has the following columns:

  • Name
  • Platform - Programming language or platform where it can be used
  • DOIs - Publication DOIs separated by semi-colons
  • PubDates - Publication dates separated with semi-colons. Preprints are marked with PREPRINT and will be updated when published.
  • Code - URL for publicly available code.
  • Description
  • License - Software license
  • Technologies in Focus - Long read sequencing technologies whose data the tool was developed for
  • Categories (Described below)
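
As a rough illustration of the semi-colon separated fields described above, the following Python sketch splits the DOIs and PubDates values of each row. It rests on several assumptions that are not confirmed by this README: that the CSV headers are spelled exactly like the column names listed here, that DOIs and dates pair up by position, and that a preprint appears as the literal value PREPRINT in PubDates. The tool names, DOIs and dates in the sample are made-up placeholders.

```python
import csv
import io

# Hypothetical two-row excerpt. The real table is lrs_tools_master.csv, and its
# header spelling may differ from the names assumed here.
SAMPLE = """Name,Platform,DOIs,PubDates
ToolA,Python,10.1000/a1;10.1000/a2,2019-05-01;PREPRINT
ToolB,C++,10.1000/b1,2020-02-07
"""


def split_field(value):
    """Split a semi-colon separated field into trimmed, non-empty items."""
    return [item.strip() for item in value.split(";") if item.strip()]


def load_publications(handle):
    """Map each tool name to a list of (doi, pub_date) pairs.

    Assumes the n-th DOI corresponds to the n-th publication date.
    """
    publications = {}
    for row in csv.DictReader(handle):
        dois = split_field(row["DOIs"])
        dates = split_field(row["PubDates"])
        publications[row["Name"]] = list(zip(dois, dates))
    return publications


def has_preprint(pairs):
    """Return True if any publication date is the PREPRINT marker."""
    return any(date == "PREPRINT" for _, date in pairs)


pubs = load_publications(io.StringIO(SAMPLE))
```

To try this on the real table, pass an open file handle for lrs_tools_master.csv instead of the in-memory sample, after checking the header names.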

Categories

The categories are TRUE/FALSE columns in lrs_tools_master.csv indicating whether the software has a particular function. These are designed to be used as filters, for example when looking for software to accomplish a particular task. They are also the most likely to be inaccurate, as software is frequently updated and it is hard to judge all the functions a package has without making significant use of it. You will see that some tools have been assigned to multiple categories. Each category is assigned based on whether the tool does the following:

  • Alignment - Aligns long reads to a reference
  • Analysis Pipelines - Is a pipeline that includes several tools
  • Basecalling - Detects changes in the electrical current produced by ONT sequencers and translates them into a DNA sequence
  • Base Modification Detection - Identifies modifications to individual bases like 5-methylcytosine, 5-hydroxymethylcytosine, and N6-methyladenine in DNA sequences
  • Data Structures - Provides alternative data formats to native long-read storage formats such as fast5
  • Demultiplexing - Uses barcode or other information to know which sequences came from which samples in a pool of samples
  • Denovo Assembly - Assembles long reads
  • Error Correction And Polishing - Corrects errors in the reads before assembly or in the genome assembly itself. Some tools use a hybrid method that draws on short reads to obtain long reads with high accuracy
  • Evaluating Existing Methods - Benchmarks and/or evaluates the functionality of existing tools and/or generates synthetic long read datasets
  • Gap Filling - Improves existing assemblies based on localised alignment and assembly
  • Gene Expression Analysis - Tests for differential expression across samples
  • Generating Consensus Sequence - Generates a consensus sequence from the assembled reads
  • Isoform Detection - Identifies multiple isoforms encoded by a single gene due to alternative splicing
  • Long Read Overlapping - Finds pairs of reads that align to each other
  • Metagenomics - Is used for studying genetic material recovered directly from environmental samples
  • Normalisation - Removes unwanted variation that may affect results
  • Provide Summary Statistics - Provides statistics that can be used to evaluate the quality of the data
  • Quality Checking - Provides a measure of the quality of the reads
  • Quality Filtering - Removes low quality reads based on a specified quality threshold
  • Quality Trimming - Removes low-quality regions from reads
  • Read Quantification - Quantifies expression from reads
  • RNA Structure - Identifies SHAPE modification using nanopore direct RNA sequencing
  • Simulators - Simulates a sequencing process and produces in-silico reads
  • SNP And Variant Analysis - Detects or uses variants
  • Suitable For Single Cell Experiments - Can be used for analysing/processing single-cell data generated by long read sequencing platforms
  • Tested On Human Data - Has published evidence of being successfully used to analyse human data
  • Tested On Non Human Data - Has published evidence of being successfully used to analyse non-human data
  • Visualisation - Visualises some aspect of long read data or analysis
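
Because the categories are TRUE/FALSE columns, filtering the table can be sketched in a few lines of Python. This is a minimal example under stated assumptions, not the code used by the website: the column headers are assumed to match the category names above (the real headers in lrs_tools_master.csv may be spelled differently), and the tools in the sample are made up.

```python
import csv
import io

# Hypothetical excerpt with three made-up tools. Header names are assumed to
# match the category names listed above; check the real CSV before relying on them.
SAMPLE = """Name,Alignment,Basecalling,Visualisation
ToolA,TRUE,FALSE,TRUE
ToolB,FALSE,TRUE,FALSE
ToolC,TRUE,FALSE,FALSE
"""


def is_true(value):
    """Interpret a TRUE/FALSE cell, ignoring case and surrounding whitespace."""
    return value.strip().upper() == "TRUE"


def tools_with(rows, *categories):
    """Return the names of tools flagged TRUE in every requested category."""
    return [
        row["Name"]
        for row in rows
        if all(is_true(row[category]) for category in categories)
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
aligners = tools_with(rows, "Alignment")
aligners_with_plots = tools_with(rows, "Alignment", "Visualisation")
```

Requiring every requested category to be TRUE mirrors the idea of stacking filters. Since the category flags are the part of the database most likely to be out of date, results are best treated as a starting list to check by hand.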

Contributors

Thank you to everyone who has contributed to long-read-tools! Your efforts to build and improve this resource for the community are greatly appreciated!

The following people have made significant contributions to the long-read-tools database or website:

  • shaniamare
  • qgouil
  • lazappi
  • alexiswl
  • scottgigante
  • mritchie
  • seoldh
