
gnick18 / fungal_icsbgcs


Reproducible code and data repository for the fungal ICS BGC prediction publication. The GitHub repository is managed by Grant Nickles. For any data-related questions, email [email protected] or [email protected]. For publication questions, reach out to the corresponding authors, Dr. Drott at [email protected] or Dr. Keller at [email protected].

bioinformatics fungi genome-mining natural-products

fungal_icsbgcs's Introduction


Peer Reviewed Publication_Link

Explanation of repository's organization:

Reproducible code and data repository for the fungal ICS BGC prediction publication. All of the core code used in the publication, along with the raw prediction files, can be found here.

The metadata for the genomes, along with tables that give more information on select BGCs, can be found in the Supplemental Information.

ICS_FungalTreeFigure

ReproducipleScripts.md

This markdown file contains all of the major scripts used in this publication. It is divided into sections by key chunks of code and includes a key at the top of the file.

Folder: ICSBGCs

This folder stores all of the raw ICS BGC predictions in GenBank, GFF, and FASTA format. The BGC predictions for each genome are organized in a nested folder structure by the NCBI accession of the genome they were found in.

  • The GenBank files are compatible with popular natural product programs such as BiG-SCAPE and clinker
  • The GFF files contain the annotations for the BGC region
  • The FASTA files contain the entire nucleotide sequence spanning each BGC's locus
  • NOTE: These are the unedited and unfiltered predictions generated by the ICS-BGC prediction pipeline. Information on specific gene cluster families, or refined BGC predictions, can be found in other folders and in the Supplemental Information
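Because the predictions are nested by genome accession, it can help to build a small index before working with them. The following is a minimal sketch, not a script from the publication. It assumes the top-level folder under the root is the NCBI accession, and the file extensions in `FORMAT_BY_SUFFIX` are guesses that should be adjusted to match the files actually present. The function name `index_predictions` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Assumed extensions (not confirmed by the repository); edit as needed.
FORMAT_BY_SUFFIX = {
    ".gbk": "genbank", ".gb": "genbank",
    ".gff": "gff", ".gff3": "gff",
    ".fasta": "fasta", ".fa": "fasta", ".fna": "fasta",
}


def index_predictions(root):
    """Return {accession: {format: [paths]}} for every file under root."""
    root = Path(root)
    index = defaultdict(lambda: defaultdict(list))
    for path in sorted(root.rglob("*")):
        fmt = FORMAT_BY_SUFFIX.get(path.suffix.lower())
        if fmt is None or not path.is_file():
            continue
        parts = path.relative_to(root).parts
        if len(parts) < 2:
            # File sits directly in root, so no accession folder to assign.
            continue
        # Assumption: the first folder below root is the genome accession.
        index[parts[0]][fmt].append(path)
    return index


if __name__ == "__main__":
    # "ICSBGCs" is the folder name given in this README.
    for accession, by_format in index_predictions("ICSBGCs").items():
        counts = {fmt: len(paths) for fmt, paths in by_format.items()}
        print(accession, counts)
```

The resulting lists of GenBank paths can then be passed to whichever downstream tool or parser is being used.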

Folder: Conserved Cores

This folder contains summary tables for the orthologs and orthogroups found in each GCF (if analyzed for such patterns). Each GCF will have two folders:

  • ConservedOrthogroups_50 displays the orthologous gene groups found across all of the BGC predictions (as determined with OrthoFinder)
    • The species with the gene are shown, along with the domains that were detected in that protein. The domain predictions can be particularly inaccurate for this table, as different species can have different protein domain predictions for the same orthologous gene
  • OrthologsFULLTable is the full summary table showing exactly which protein in a given BGC was sorted into a family of orthologous genes. To match this table to the other table, use the GroupName column. Its values are the same in both files.
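Matching the two tables on GroupName can be sketched as follows. This is an illustrative example only. It assumes both tables are delimited text with a header row that contains a `GroupName` column; the tab delimiter, the function names, and the file names in the usage comment are hypothetical.

```python
import csv
from collections import defaultdict


def read_table(handle, delimiter="\t"):
    """Read a delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter=delimiter))


def link_tables(summary_rows, full_rows, key="GroupName"):
    """Attach every full-table row to the summary row sharing its key."""
    members = defaultdict(list)
    for row in full_rows:
        members[row[key]].append(row)
    return {
        row[key]: {"summary": row, "members": members.get(row[key], [])}
        for row in summary_rows
    }


# Usage (hypothetical file names):
# with open("ConservedOrthogroups_50.tsv", newline="") as s, \
#         open("OrthologsFULLTable.tsv", newline="") as f:
#     linked = link_tables(read_table(s), read_table(f))
```

Each entry of the result pairs one orthogroup summary row with all proteins assigned to that group, so per-BGC membership can be checked without repeated lookups.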

Folder: Trees

This folder contains all of the key trees generated in this publication, each sorted into its own folder. When relevant, the log files produced by IQ-TREE are provided; they show information such as the evolutionary model selected by ModelFinder and the bash command that was run.
