
gene_loss

#these are brief descriptions of the code and data associated with Woodruff 2019, "Patterns of putative gene loss suggest rampant developmental system drift in nematodes"
#if you have any questions about this, please contact me at [email protected]

#these are the versions of all retrieved sequences and annotations
retrieved_genomes_versions.txt


#this renames protein set files and adds species-specific prefixes to gene ids
prepare_protein_sets.sh


#these scripts discard alternative splice variants aside from the longest isoform per gene
canonical_filter_elegans.py 
canonical_filter_remanei-latens.py 
canonical_filter_t.py
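
#a minimal sketch of longest-isoform filtering, not the code in the canonical_filter_*.py scripts themselves
#it assumes protein ids of the form GENE.ISOFORM (a hypothetical naming convention); the helper names parse_fasta, gene_id, and longest_isoforms are illustrative only

```python
def parse_fasta(text):
    """Yield (protein_id, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the first whitespace-delimited token as the id.
            header, chunks = line[1:].split()[0], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gene_id(protein_id):
    """Assumed convention: 'geneA.2' is isoform 2 of gene 'geneA'."""
    return protein_id.rsplit(".", 1)[0]


def longest_isoforms(records):
    """Return {gene: (protein_id, sequence)} keeping the longest isoform."""
    best = {}
    for protein_id, seq in records:
        gene = gene_id(protein_id)
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (protein_id, seq)
    return best


# Toy input with two isoforms of geneA and one of geneB.
example = """>geneA.1
MKT
>geneA.2
MKTLLV
>geneB.1
MA
"""
kept = longest_isoforms(parse_fasta(example))
for gene, (protein_id, seq) in kept.items():
    print(gene, protein_id, len(seq))
```

#real annotation sets usually encode the gene-to-isoform relationship differently for each source, which is presumably why there is a separate filter script per protein set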


#these are the blastp commands generated by OrthoFinder used for orthology assignment
all_by_all_blastp_commands.txt


#this is the OrthoFinder command used for orthology assignment
orthofinder_command.txt


#this script retrieves orthogroups found in all species but one
get_species-specific_gene_losses.sh
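
#a minimal sketch of finding orthogroups present in all species but one, not the shell script itself
#it assumes a tab-delimited orthogroup table with one column per species and an empty cell where a species has no member; the species names sp1-sp3 and the function name single_species_losses are hypothetical

```python
import csv
import io


def single_species_losses(table_text):
    """Return {species: [orthogroups missing from only that species]}."""
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    species = header[1:]
    losses = {}
    for row in reader:
        if not row:
            continue
        # Pad short rows so trailing empty cells are still counted as absent.
        cells = row[1:] + [""] * (len(species) - len(row) + 1)
        absent = [sp for sp, cell in zip(species, cells) if not cell.strip()]
        if len(absent) == 1:
            losses.setdefault(absent[0], []).append(row[0])
    return losses


# Toy table: OG2 lacks only sp2, OG4 lacks only sp3, OG3 lacks two species.
example = (
    "Orthogroup\tsp1\tsp2\tsp3\n"
    "OG1\ta1, a2\tb1\tc1\n"
    "OG2\ta3\t\tc2\n"
    "OG3\t\t\tc3\n"
    "OG4\ta4\tb2\t\n"
)
losses = single_species_losses(example)
for sp, groups in sorted(losses.items()):
    print(sp, groups)
```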


#this script retrieves best reciprocal blast hits
reciprocal_blast_hits.py
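
#a minimal sketch of best reciprocal hit detection, not the code in reciprocal_blast_hits.py
#it assumes standard 12-column BLAST tabular output (bit score in the last column) and picks the best hit per query by bit score; the helper names best_hits, reciprocal_best_hits, and row are illustrative only

```python
def best_hits(blast_text):
    """Map each query to its highest-bit-score subject."""
    best = {}
    for line in blast_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, score = fields[0], fields[1], float(fields[11])
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


def row(query, subject, bits):
    """Build one toy tabular line; all values except the ids and bit score are placeholders."""
    return "\t".join([query, subject, "90.0", "100", "0", "0",
                      "1", "100", "1", "100", "1e-50", str(bits)])


a_vs_b = "\n".join([row("a1", "b1", 200), row("a1", "b2", 150), row("a2", "b2", 180)])
b_vs_a = "\n".join([row("b1", "a1", 200), row("b2", "a1", 150), row("b2", "a2", 120)])

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

#in the toy data, a2's best hit is b2, but b2's best hit is a1, so only the a1/b1 pair is reciprocal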


#this script replaces OrthoFinder gene IDs with assembly gene IDs
replace_seq_ids.sh


#these are the best reciprocal blastp hits
best_reciprocal_blast_hits.zip


#this script extracts sequences from a multifasta file (not my original script)
fasta_filter.pl


#this was used to find tblastn hits outside of predicted coding regions
filter_tblastn_hits.sh


#these are the tblastn commands
tblastn_commands.txt


#these are the C. elegans constituents of lost orthogroups that were used to connect lost genes with WormBase and transcriptomic data
elegans_constituents_of_lost_orthogroups.zip


#this is the retrieved WormBase Anatomy, Life Stage, Phenotype, Interaction, and Reference Count data
wormbase_simplemine.txt


#this command was used to find all Pfam domains in all C. elegans proteins 
hmmer_command.txt


#these are the Pfam domains associated with C. elegans proteins
elegans_domains.txt


#this counts the number of Pfam domains per C. elegans protein
domain_counts.R


#this was used to connect WormBase and Boeck et al. 2016 transcriptomic data with putative lost genes
connect_lost_genes_wormbase_boeck_transcript.sh


#this is the final WormBase and transcriptomic data set
lost_genes_wormbase_boeck.tsv


#this was used to analyse the WormBase and transcriptome data
wormbase_boeck_analysis.R


#this was used to classify the lost genes as essential, inessential, or without phenotypes
essential_genes.sh


#this is the list of essential phenotypes
essential_phenotypes.txt


#the number of species-specific lost orthologous groups across Caenorhabditis
all_caeno_og.tsv


#the number of species-specific lost orthologous groups across the Elegans group
elegans_group_og.tsv


#this makes figure 2
figure_2.R


#number of essential genes among the C. elegans constituents of lost orthologous groups across Caenorhabditis
elegans_constituents_of_lost_og_phen_data_all_caeno.tsv


#number of essential genes among the C. elegans constituents of lost orthologous groups across the Elegans group
elegans_constituents_of_lost_og_phen_data_elegans_group.tsv


#this makes figure 3
figure_3.R


#this makes the supplemental figures
supplemental_figures.R 


#this makes the data for supplemental figure 6
supplemental_figure_6.sh


#genome assembly metrics for the genomes used in this study
genome_statistics.tsv
