mingle's People

Contributors

cs09g, donovan-h-parks, elfrouin, geronimp, wwood

Forkers

geronimp, elfrouin

mingle's Issues

Show orientation of pre- and post-genes

It would be helpful if the pre- and post-genes reported by mingle indicated their orientation on the contig. This would allow assessment of conserved gene order and orientation across genomes.
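As a rough illustration of what this request could look like, the sketch below finds the genes immediately before and after a target gene on the same contig and reports each neighbour's strand relative to the target. The `Gene` record, its fields, and both function names are hypothetical; mingle's actual data structures may differ.

```python
from collections import namedtuple

# Hypothetical gene record; not taken from mingle's code base.
Gene = namedtuple("Gene", ["gene_id", "contig", "start", "end", "strand"])


def flanking_genes(genes, target_id):
    """Return (pre, post) genes flanking target_id on its contig.

    Either element is None when the target sits at a contig edge.
    """
    target = next(g for g in genes if g.gene_id == target_id)
    # Order genes on the target's contig by start coordinate.
    on_contig = sorted(
        (g for g in genes if g.contig == target.contig),
        key=lambda g: g.start,
    )
    i = on_contig.index(target)
    pre = on_contig[i - 1] if i > 0 else None
    post = on_contig[i + 1] if i + 1 < len(on_contig) else None
    return pre, post


def describe(gene, target):
    """Format a neighbour with its strand and its orientation relative to the target."""
    if gene is None:
        return "none"
    relation = "same" if gene.strand == target.strand else "opposite"
    return f"{gene.gene_id} ({gene.strand}, {relation} strand)"
```

Reporting the strand relative to the target gene, rather than only the absolute strand, would keep the output comparable between genomes whose contigs were assembled in opposite orientations.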

mingle blast mode can max out the number of hits returned

@dparks1134 I'm guessing 500 is the number of blast hits rather than the number of hits in the database. Maybe just use a bigger number of hits returned by default?

uqbwoodc@mrca003:20150220:/srv/projects/abisko/shotgun_abundance/53_doe_talk_functional_gene_pca/bclA$ mingle blast -q AAN32623.1.fasta -o AAN32623.1.mingle --cpus 40
[2015-02-20 22:49:49] INFO: Reading metadata information from ARB GreenGenes file.
[2015-02-20 22:49:49] INFO: Reading taxonomy information from ARB GreenGenes file.
[2015-02-20 22:49:49] INFO:   Found records for 15134 genomes.
[2015-02-20 22:49:49] INFO: Identifying homologous genes using BLAST.
[2015-02-20 22:50:17] INFO:   Identified 500 homologous genes.
[2015-02-20 22:50:17] INFO: Extracting homologous sequences.
[2015-02-20 22:51:10] INFO: Creating GreenGenes-style file for ARB.
[2015-02-20 22:51:11] INFO: Inferring multiple sequence alignment.
[2015-02-20 22:53:20] INFO: Inferring gene tree.
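
One possible way to handle this, sketched under assumptions: BLAST+ documents a default of 500 for `-max_target_seqs`, which would be consistent with the count in the log above, although whether mingle relies on that default or invokes `blastp` at all is not confirmed here. The function names, the `db` argument, and the example values are hypothetical.

```python
import logging

# Documented BLAST+ default for -max_target_seqs.
BLAST_DEFAULT_MAX_TARGET_SEQS = 500


def build_blastp_args(query_file, db, cpus,
                      max_target_seqs=BLAST_DEFAULT_MAX_TARGET_SEQS):
    """Build a blastp argument list with an explicit hit limit (assumed invocation)."""
    return [
        "blastp",
        "-query", query_file,
        "-db", db,
        "-outfmt", "6",
        "-num_threads", str(cpus),
        "-max_target_seqs", str(max_target_seqs),
    ]


def hit_limit_reached(num_hits, max_target_seqs):
    """Warn and return True when the hit count equals the limit,
    which suggests the result set may have been truncated."""
    if num_hits >= max_target_seqs:
        logging.warning(
            "Identified %d homologous genes, equal to the hit limit (%d); "
            "additional homologs may have been dropped.",
            num_hits, max_target_seqs,
        )
        return True
    return False
```

Raising the default would address the immediate problem, while a warning of this kind would still flag cases where a larger limit is also saturated.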

Verify sequence names

A number of characters are invalid within the names of user-supplied sequences. It would be good to do an initial check for these characters: ,;()|~.
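
A minimal sketch of such a check, assuming the input is FASTA and that the sequence name is the first whitespace-delimited token of each header line. The trailing period in the list above is treated as sentence punctuation rather than an invalid character, which is an assumption (the query file `AAN32623.1.fasta` itself contains a period). The function names are hypothetical.

```python
# Characters listed in the issue; the trailing period is assumed to be punctuation.
INVALID_CHARS = set(",;()|~")


def invalid_name_chars(seq_id):
    """Return the sorted list of invalid characters found in a sequence name."""
    return sorted(set(seq_id) & INVALID_CHARS)


def check_fasta_names(lines):
    """Map each offending sequence name to the invalid characters it contains.

    `lines` is any iterable of FASTA lines, such as an open file handle.
    """
    problems = {}
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            bad = invalid_name_chars(seq_id)
            if bad:
                problems[seq_id] = bad
    return problems
```

Running this before any BLAST or alignment step would let the tool report all offending names at once instead of failing partway through the pipeline.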

Difficulties with identification of public genomes

Unfortunately, genomes have not been added to the genome tree database in a consistent manner. As such, the metadata is not entirely consistent, and identifying public genomes using entry['core_list_status'] == 'public' may result in some public genomes being missed. If we appear to be missing some expected genomes when filtering on the public flag, this may be why.
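
To gauge how many genomes the public flag might be missing, one option is to tally every value of `core_list_status` alongside the filter. The sketch below assumes the metadata is a dict mapping a genome ID to an entry dict; that layout, the function names, and the idea of tallying are illustrative rather than taken from mingle.

```python
from collections import Counter


def public_genomes(metadata):
    """Return genome IDs whose entry is flagged public.

    `metadata` is assumed to map genome ID -> entry dict.
    """
    return [
        genome_id
        for genome_id, entry in metadata.items()
        if entry.get("core_list_status") == "public"
    ]


def status_counts(metadata):
    """Count each core_list_status value; entries without the field count under None."""
    return Counter(entry.get("core_list_status") for entry in metadata.values())
```

A large count under `None` or under unexpected values would point to the inconsistent metadata described above.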

Should we just filter on 'A' vs. 'C' genomes? Or just let this go for now with the idea that the GTDB will improve with time?
