
bedqc's Introduction

BEDQC

BEDQC is a tool to validate and profile genomic intervals in BED format.

Once you upload a file to BEDQC, it automatically compares your file to the BED specification, including checking the delimiter and line endings. A count of intervals and their lengths is also tabulated and presented. No data is transferred from your local environment.
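The checks described above can be illustrated with a minimal sketch. This is not BEDQC's actual code: the function name `profile_bed` and the exact set of checks are assumptions made for illustration, and only a few rules of the BED format (tab delimiter, Unix line endings, integer start/end with start <= end) are covered.

```python
from collections import Counter


def profile_bed(text: str) -> dict:
    """Check a few basic BED conventions and tabulate interval lengths.

    Hypothetical helper for illustration; not part of BEDQC.
    """
    problems = []
    lengths = []
    if "\r" in text:
        problems.append("non-Unix line endings (carriage returns found)")
    for number, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines and the header lines that BED permits.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append(f"line {number}: fewer than 3 tab-separated fields")
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            problems.append(f"line {number}: start/end are not integers")
            continue
        if start < 0 or end < start:
            problems.append(f"line {number}: invalid coordinates {start}-{end}")
            continue
        # BED intervals are 0-based and half-open, so length is end - start.
        lengths.append(end - start)
    return {
        "intervals": len(lengths),
        "length_counts": Counter(lengths),
        "problems": problems,
    }


# The third line uses spaces instead of tabs, so it is reported as a problem.
report = profile_bed("chr1\t100\t200\nchr1\t300\t350\nchr2 10 20\n")
print(report["intervals"])
print(report["problems"])
```

Collecting problems in a list rather than raising on the first one mirrors the "report everything at once" behavior a QC tool usually wants.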

Once a BED file is uploaded, you can perform more complex tasks on it, including calculating metrics based on the intervals and intersecting your file with various other annotations. For instance, since most annotations will not overlap gaps in the same genome build, you can use this tool to help determine your file's genome build: compare the file against the gaps from various builds and find which build has the fewest intersecting hits.
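The genome build heuristic can be sketched as follows. BEDQC itself runs bedtools for intersections; this pure-Python version only illustrates the idea. The function names, the build labels `buildA`/`buildB`, and all coordinates are made up for the example and do not correspond to real gap annotations.

```python
def count_overlaps(intervals, gaps):
    """Count intervals that overlap at least one gap (half-open coordinates)."""
    hits = 0
    for chrom, start, end in intervals:
        for g_chrom, g_start, g_end in gaps:
            if chrom == g_chrom and start < g_end and g_start < end:
                hits += 1
                break  # count each interval once, however many gaps it hits
    return hits


def guess_build(intervals, gaps_by_build):
    """Return the build whose gaps intersect the fewest intervals, plus all counts."""
    counts = {
        build: count_overlaps(intervals, gaps)
        for build, gaps in gaps_by_build.items()
    }
    return min(counts, key=counts.get), counts


# Toy data: hypothetical intervals and hypothetical gap tracks.
intervals = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
gaps_by_build = {
    "buildA": [("chr1", 150, 160), ("chr2", 60, 70)],
    "buildB": [("chr1", 1000, 2000)],
}

best, counts = guess_build(intervals, gaps_by_build)
print(best, counts)
```

The nested loop is quadratic and only suitable for small inputs; a real implementation would sort the intervals or delegate to bedtools, as the tool does.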

BED file specification.

This tool utilizes the biowasm project to execute bedtools in the browser.

bedqc's People

Contributors

arq5x, brwnj, robertaboukhalil

Forkers

unique379r

bedqc's Issues

Encode user input to avoid XSS

Hey folks, awesome tool!

It seems the app executes any JavaScript found inside a BED file. For example, loading the following BED file launches an alert box:

chr2<script>alert('hello!')</script>	789	1020

I don't think it's a security issue since the data is local/there's no auth in your app, so feel free to close, but just FYI :)
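The fix suggested by the issue title is to encode field values before they reach the page. The app runs in the browser and its rendering code is not shown here, so the following is only a language-neutral illustration of the principle using Python's standard `html.escape`; the helper `render_row` is hypothetical.

```python
import html


def render_row(fields):
    """Build an HTML table row, escaping every field value first."""
    cells = "".join(f"<td>{html.escape(field)}</td>" for field in fields)
    return f"<tr>{cells}</tr>"


# The BED line from the report above, split on tabs.
line = "chr2<script>alert('hello!')</script>\t789\t1020"
row = render_row(line.split("\t"))
print(row)
```

After escaping, the chromosome name is displayed as literal text and the `<script>` tag is never parsed as markup.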

Support for user uploaded genome files

To support other species, we will need to provide other species genome builds (chrom and length files) on the interface, but also want to allow the user to upload their own for generality.
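A user-supplied genome file in the two-column layout that bedtools expects (chromosome name, a tab, then length) could be parsed and used roughly as below. This is a sketch under assumptions: the function names are hypothetical, and the chromosome names and sizes are invented for the example.

```python
def parse_genome_file(text):
    """Parse 'chrom<TAB>length' lines into a dict of chromosome sizes."""
    sizes = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2 or not fields[1].isdigit():
            raise ValueError(f"line {number}: expected chrom<TAB>length")
        sizes[fields[0]] = int(fields[1])
    return sizes


def out_of_bounds(intervals, sizes):
    """Return intervals on unknown chromosomes or extending past the chromosome end."""
    return [
        (chrom, start, end)
        for chrom, start, end in intervals
        if chrom not in sizes or end > sizes[chrom]
    ]


# Hypothetical genome file and intervals.
sizes = parse_genome_file("chrA\t1000\nchrB\t500\n")
bad = out_of_bounds(
    [("chrA", 0, 1000), ("chrB", 400, 600), ("chrC", 1, 2)],
    sizes,
)
print(sizes)
print(bad)
```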

Potential issue in identifying inconsistent delimiters

I think this is a neat tool and was just trying it out today. I am curious about one potential issue. I've noticed that if the first 3 fields are separated by tabs but any subsequent fields are separated by another delimiter (e.g. a space), bedqc gives green checks all around but incorrectly identifies the file as having 3 fields. Files where the first 4 fields are delimited by tabs but with space delimitation afterwards seem to run through bedtools intersect (and I assume other functions) fine, so this may rarely be an issue. Still, I assume inconsistent delimitation of this type may happen often (for example, when appending extra columns to a BED file), and it may help for bedqc to explicitly report something like this. I apologize if I'm simply missing something or if this is a trivial point.
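One way such a check could work is to compare the number of fields obtained by splitting on tabs with the number obtained by splitting on any whitespace. The sketch below is an assumption about how this might be reported, not bedqc's implementation, and `find_mixed_delimiters` is a hypothetical name.

```python
def find_mixed_delimiters(text):
    """Flag lines whose tab-split and whitespace-split field counts disagree.

    Returns a list of (line_number, tab_fields, whitespace_fields) tuples.
    """
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        tab_fields = len(line.split("\t"))
        whitespace_fields = len(line.split())
        if tab_fields != whitespace_fields:
            flagged.append((number, tab_fields, whitespace_fields))
    return flagged


sample = (
    "chr1\t100\t200\tgeneA\n"       # consistent tabs
    "chr1\t300\t400 geneB 0 +\n"    # tabs first, spaces afterwards
)
print(find_mixed_delimiters(sample))
```

A field that legitimately contains a space (for example in a name column) would also be flagged, so a result like this is better shown as a warning than as a hard failure.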

Overlap counts not working

When intersecting RefSeq in GRCh37 with the Ensembl whole gene track, here is what I get (1 hit on chr1):
[image]
