clean_asv_data's People

Contributors

johnne

clean_asv_data's Issues

cleaned_nochimera_cluster_taxonomy.tsv

Create a script to generate a resolved taxonomic annotation for all clusters in the corresponding counts file.

I am suggesting a simple approach in which we sort the annotations within each cluster by number of reads. By definition, our clusters are uniformly annotated down to the family level. From the genus level onward, I suggest we accumulate annotations by read count until we reach some critical threshold, such as 50%. If a single annotation is enough to reach the threshold, we use that annotation for the cluster. Otherwise we use an annotation like “unresolved” to indicate that the cluster's annotation is ambiguous at that level. I prefer a distinct term, because the majority-rule annotation may itself be something with “_X” or “unclassified”.

For Arthropoda:

At an 80% threshold, there are 724, 1258 and 2269 unresolved clusters.
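
A minimal sketch of the thresholding rule described above, for one cluster at one rank. The function name `resolve_label`, its parameters, and the input layout (one label and one read count per ASV) are assumptions made for illustration; they are not taken from the repository.

```python
from collections import Counter


def resolve_label(labels, reads, threshold=0.5, unresolved="unresolved"):
    """Pick a consensus label for one cluster at one taxonomic rank.

    labels: rank label of each ASV in the cluster (hypothetical input layout)
    reads:  total read count of each ASV, in the same order
    """
    # Sum reads per distinct label
    per_label = Counter()
    for label, n in zip(labels, reads):
        per_label[label] += n

    total = sum(per_label.values())
    if total == 0:
        return unresolved

    # Accumulate labels from most to fewest reads until the threshold is reached
    cumulative = 0
    needed = []
    for label, n in per_label.most_common():
        needed.append(label)
        cumulative += n
        if cumulative / total >= threshold:
            break

    # A single label reaching the threshold resolves the cluster
    return needed[0] if len(needed) == 1 else unresolved


genus = ["GenusA", "GenusA", "GenusB"]
counts = [60, 10, 30]
print(resolve_label(genus, counts, threshold=0.5))
print(resolve_label(genus, counts, threshold=0.8))
```

With these made-up counts, GenusA holds 70% of the reads, so the cluster resolves at a 50% threshold but not at 80%. Whether the comparison should be `>=` or `>`, and how exact ties are handled, are open choices that the sketch does not settle.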

asv_stats.tsv

Script to generate a file listing all ASVs with their total number of reads and the total number of samples in which they occur (and possibly other stats).
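
One way the per-ASV statistics could be computed is sketched below. The nested-dictionary input, the function name `asv_stats`, and the output keys `reads` and `occurrences` are assumptions for illustration; the actual counts file layout and column names are not given here.

```python
def asv_stats(counts):
    """Summarise each ASV across samples.

    counts: {asv_id: {sample_id: read_count}} (hypothetical in-memory layout)
    Returns {asv_id: {"reads": total reads, "occurrences": samples with > 0 reads}}.
    """
    stats = {}
    for asv, per_sample in counts.items():
        stats[asv] = {
            "reads": sum(per_sample.values()),
            # A sample counts as an occurrence only if the ASV has reads there
            "occurrences": sum(1 for n in per_sample.values() if n > 0),
        }
    return stats


example = {
    "asv1": {"s1": 12, "s2": 0, "s3": 5},
    "asv2": {"s1": 0, "s2": 0, "s3": 1},
}
for asv, row in asv_stats(example).items():
    print(asv, row["reads"], row["occurrences"], sep="\t")
```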

cleaned_nochimera_swarm_clusters.tsv

Script to generate a file with the subset of ASVs that remains after removing clusters/ASVs unclassified at the family level (“_X” or “unclassified”), clusters with fewer than 3 reads, and ASVs occurring in more than 20% of blanks.
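
The three filters could be combined roughly as follows. This is a sketch under several assumptions: the function name `keep_asvs`, the dictionary inputs, and the pattern used to recognise unclassified family labels (a label starting with "unclassified" or ending in "_X", "_XX", and so on) are all hypothetical, and cluster read totals are taken over all samples, blanks included.

```python
import re

# Assumed convention for unclassified labels; adjust to the real taxonomy file
UNCLASSIFIED = re.compile(r"^unclassified|_X+$")


def keep_asvs(asv_cluster, cluster_family, asv_counts, blanks,
              min_cluster_reads=3, max_blank_fraction=0.2):
    """Return the ASV ids that pass all three filters, in input order.

    asv_cluster:    {asv_id: cluster_id}
    cluster_family: {cluster_id: family label}
    asv_counts:     {asv_id: {sample_id: read_count}}
    blanks:         list of sample ids that are blanks
    """
    # Total reads per cluster, summed over its ASVs and all samples
    cluster_reads = {}
    for asv, cluster in asv_cluster.items():
        cluster_reads[cluster] = cluster_reads.get(cluster, 0) + sum(asv_counts[asv].values())

    kept = []
    for asv, cluster in asv_cluster.items():
        if UNCLASSIFIED.search(cluster_family[cluster]):
            continue  # unclassified at the family level
        if cluster_reads[cluster] < min_cluster_reads:
            continue  # cluster has too few reads
        hits = sum(1 for b in blanks if asv_counts[asv].get(b, 0) > 0)
        if blanks and hits / len(blanks) > max_blank_fraction:
            continue  # ASV occurs in too many blanks
        kept.append(asv)
    return kept
```

Whether cluster read totals should be computed before or after individual ASVs are dropped is not specified above; the sketch computes them first.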
