samandmac / phylobuild

This pipeline generates a phylogenetic tree that includes a heatmap showing carriage of specific genes, the assigned phylogroup of each strain, and the name of each strain. It should now work with any set of strains; I'm currently making small improvements.

Perl 3.00% Shell 52.78% R 25.42% Python 18.79%
bioinformatics ecoli phylogenetic-trees phylogeny phylogroup

phylobuild's Introduction

  • 👋 Hi, I'm SA Mac!
  • 👀 I'm interested in developing on my coding experience, with a focus on bioinformatics but also learning new programming languages!
  • 🌱 I'm currently learning Python, Linux, Java, and more about Bioinformatics tools associated with these languages
  • 💞️ I'm looking to collaborate on anything, whether that be me providing assistance (so I can learn more!) or getting assistance from others (also to learn more!)
  • 📫 Contact me on here or by email!

phylobuild's People

Contributors

samandmac

phylobuild's Issues

Possible issues with PhyloGenes on macOS

I've encountered two problems affecting macOS users:

  • SED error when running PhyloGenes
  • The "if" condition in the loop does not behave as expected, even though the comparison should succeed

To work around these issues, I recommend the following, respectively:

  • Install GNU sed (brew install gnu-sed) and replace every "sed" in PhyloGenes.sh with "gsed".
  • Comment out lines 189-196 in PhyloGenes.sh, which skips the deletion of files in the PhyloGenes folder. Then run each part of the if condition manually:
blastn -query /Users/you/Documents/PhyloTree-main/PhyloGenes/GeneListX.txt -subject /Users/you/Documents/PhyloTree-main/Tree_Genomes/$W.fasta -qcov_hsp_perc 80 -perc_identity 70 -outfmt "6 qseqid pident" | gsed 's/^\(.\{0\}\)/\1>/' | awk '!seen[$0]++' | sort -k 2n  > Z_Last_File.txt #X in GeneListX.txt should be the number of genomes. E.g. if you have 88 strains, you should use GeneList88.txt

head Z_Last_File.txt | awk '{print $1}' > Z_Ortho_Names.txt
		
grep -Fwf Z_Ortho_Names.txt singleLineGenes.txt | gsed 's/[[:blank:]]*\([^[:blank:]]*\)$/\n\1/' > Z_Orthologs.txt

gsed '/^>/ s/^.*gene=\([Aa-Zz]\+\).*/\1/' Z_Orthologs.txt | gsed '1~2s/^/>/' > /Users/you/Documents/PhyloTree-main/Tree_Genes/geneList.txt

Remember to replace the path used in the example above with your own path.
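To avoid editing the path in several places, the base directory could be kept in one variable. This is only a minimal sketch: PHYLO_ROOT, gene_list_path and genome_path are hypothetical names that do not appear in PhyloGenes.sh, and the directory layout is taken from the example commands above.

```shell
#!/bin/sh
# Hypothetical variable: set PHYLO_ROOT to wherever the repository was unpacked.
# The default below mirrors the example path and is only an assumption.
PHYLO_ROOT="${PHYLO_ROOT:-$HOME/Documents/PhyloTree-main}"

# Build the path to GeneListX.txt, where X is the number of genomes.
gene_list_path() {
    printf '%s/PhyloGenes/GeneList%s.txt\n' "$PHYLO_ROOT" "$1"
}

# Build the path to one genome FASTA file in Tree_Genomes.
genome_path() {
    printf '%s/Tree_Genomes/%s.fasta\n' "$PHYLO_ROOT" "$1"
}
```

With 88 strains, "$(gene_list_path 88)" could then stand in for the hard-coded -query argument, and "$(genome_path "$W")" for the -subject argument.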

I'd appreciate any input from Mac users on why the comparison in the if statement in PhyloGenes.sh isn't working, and on workarounds for the sed error, given that sed differs between Linux and macOS.
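One possible way to avoid renaming every call by hand is to choose the sed binary once at the top of the script. The sketch below assumes that Homebrew's gnu-sed is installed under the name gsed; pick_sed, SED and prefix_gt are hypothetical names, not part of PhyloGenes.sh.

```shell
#!/bin/sh
# Return "gsed" when it is on the PATH (GNU sed installed via Homebrew on macOS),
# otherwise fall back to plain "sed".
pick_sed() {
    if command -v gsed >/dev/null 2>&1; then
        echo gsed
    else
        echo sed
    fi
}

SED=$(pick_sed)

# Example call through the chosen binary: prepend ">" to every input line.
prefix_gt() {
    "$SED" 's/^/>/'
}
```

On a Mac without gsed this falls back to the system (BSD) sed, so any GNU-only syntax in the script would still fail there; the sketch only removes the need to edit each call.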

Possible SED issue with Windows 11 (Ubuntu on terminal)

Running PhyloGenes in Ubuntu on a Windows 11 (or perhaps Windows 10) terminal may return a sed error. If this occurs, the fix is to edit PhyloGenes.sh and change line 163 from:

sed '/^>/ s/^.*gene=\([Aa-Zz]\+\).*/\1/' Z_Orthologs.txt | sed '1~2s/^/>/' > $genesForTree/geneList.txt

to:

sed '/^>/ s/^.*gene=\([A-z]\+\)].*/\1/' Z_Orthologs.txt | sed '1~2s/^/>/' > $genesForTree/geneList.txt

It should then work. Remember to change the line back to the original if you switch to anything other than Windows Terminal.
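A variant that may not need switching between platforms is sketched below. It rests on assumptions: that the error comes from the range inside [Aa-Zz], whose meaning depends on the locale's collation order and which some environments reject as invalid, and that the header lines look like the hypothetical sample in the usage note. The sketch uses sed -E with [A-Za-z] under LC_ALL=C, and awk in place of the GNU-only 1~2 address, so it should behave the same with GNU and BSD sed. The function names are hypothetical.

```shell
#!/bin/sh
# Replace each header line (starting with ">") by the letters that follow "gene=".
# -E selects extended regular expressions, accepted by both GNU and BSD sed.
extract_gene_names() {
    LC_ALL=C sed -E '/^>/ s/^.*gene=([A-Za-z]+).*/\1/' "$@"
}

# Prefix every odd-numbered line with ">", standing in for sed '1~2s/^/>/'.
mark_odd_lines() {
    awk 'NR % 2 == 1 { print ">" $0; next } { print }'
}
```

For example, feeding the two lines ">lcl|X_cds_1 [gene=thrL] [locus_tag=b0001]" and "ATGAAACGC" through extract_gene_names | mark_odd_lines gives ">thrL" followed by "ATGAAACGC".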
